INTRODUCTION:


A.  Foreword: 

As  described  in  the  previous  report,  and  in  accordance  with  the  approved 
amended  statement  of  work  for  this  project,  we  have  concentrated  our  efforts  on 
understanding  basic-helix-loop-helix  protein  DNA  binding  specificity,  and  on 
elucidating the functions of TIS11/TTP proteins. Each section below will be divided into
two  parts,  each  of  which  will  correspond  to  one  of  these  projects. 

B.  Determinants  of  bHLH  protein  DNA  binding  specificity: 

A  large  family  of  transcriptional  regulators  is  defined  by  the  basic-helix-loop- 
helix  (bHLH)  motif  (Murre  et  al.,  1989),  in  which  a  DNA-binding  basic  region  (BR)  lies 
immediately amino terminal to the HLH dimerization segment (Davis et al., 1990;
Murre et al., 1989; Voronova and Baltimore, 1990). In metazoans, bHLH proteins are
involved  in  specification  of  multiple  cell  types  (Lee,  1997;  Olson  and  Klein,  1994; 
Weintraub  et  al.,  1991).  Some  bHLH  family  members  function  as  homodimers,  but 
others  appear  to  act  together  with  a  heterodimeric  partner  (Weintraub  et  al.,  1991).  For 
example,  the  closely-related  bHLH  proteins  that  mediate  myogenic  differentiation, 
including  MyoD,  are  thought  to  function  as  heterodimers  with  E  proteins,  a  widely- 
expressed  bHLH  protein  subgroup  that  is  exemplified  by  the  E2A  proteins 
(Chakraborty et al., 1991; Davis et al., 1990; Lassar et al., 1991; Neuhold and Wold,
1993). Most bHLH protein dimers bind to the consensus CANNTG (the E box), with
each respective BR binding to a half site (Blackwell and Weintraub, 1990; Ellenberger et
al., 1994; Ferre-D'Amare et al., 1993; Ferre-D'Amare et al., 1994; Ma et al., 1994; Parraga
et  al.,  1998;  Shimizu  et  al.,  1997).  Given  the  many  regulatory  processes  in  which  bHLH 
proteins  are  involved,  the  apparent  simplicity  of  the  CANNTG  consensus  raises  the 
important  question  of  how  different  bHLH  proteins  act  only  on  appropriate  target 
genes  (Weintraub  et  al.,  1991). 
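
The CANNTG consensus described above can be illustrated with a short sketch. The Python below is only an illustration, not part of the reported work: the function name find_e_boxes and the example sequence are hypothetical, and the scan simply lists every CANNTG match together with its central two bases.

```python
import re

# CANNTG consensus (the E box). The lookahead allows overlapping matches
# to be reported, should any occur.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (0-based start, site, central NN) for every CANNTG match."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1)[2:4]) for m in E_BOX.finditer(seq)]


# Hypothetical sequence containing three E boxes with different centers.
example = "ttgaCAGCTGttaccCACGTGaaCATATGc"

for start, site, center in find_e_boxes(example):
    print(start, site, center)
```

All three sites satisfy the same six-base consensus and differ only in the central NN, which is why the consensus alone cannot account for target gene selection.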

In  part,  the  specificity  with  which  bHLH  proteins  function  derives  from 
preferential  recognition  of  different  classes  of  CANNTG  sites  by  different  bHLH 
protein  subgroups.  The  HLH  segment  consists  of  a  parallel,  left-handed,  four  helix 
bundle (Fig. 1) (Ellenberger et al., 1994; Ferre-D'Amare et al., 1993; Ferre-D'Amare et
al., 1994; Ma et al., 1994; Parraga et al., 1998; Shimizu et al., 1997). The BR is
unstructured in solution (Anthony-Cahill et al., 1992), but when bound to DNA it
extends N-terminally from the HLH segment as an α-helix that crosses the major
groove  (Fig.  1).  Crystallographic  analyses  have  revealed  some  differences  in  how  these 
proteins  bind  DNA.  For  example,  in  Myc-family  and  related  bHLH  proteins,  an 
arginine  (Arg)  residue  at  BR  position  13  (Fig.  2)  specifies  recognition  of  CACGTG  sites 
(Blackwell  et  al.,  1993;  Dang  et  al.,  1992;  Halazonetis  and  Kandil,  1992;  Van  Antwerp  et 
al., 1992) by contacting bases in the center (Ferre-D'Amare et al., 1993; Ferre-D'Amare
et al., 1994; Shimizu et al., 1997). However, it still is not understood how bHLH proteins
which  have  a  different  amino  acid  at  BR  position  13  (Fig.  2)  bind  preferentially  to 
distinct  CANNTG  sites  (Blackwell  and  Weintraub,  1990;  Dang  et  al.,  1992),  or  how 
bHLH  proteins  establish  differences  in  flanking  sequence  selectivity  (Blackwell  and 
Weintraub,  1990;  Fisher  et  al.,  1993;  Gould  and  Bresnick,  1998)  that  can  be  of  biological 
importance  (Aksan  and  Goding,  1998;  Jennings  et  al.,  1999). 

Many bHLH proteins that lack R13, including MyoD and other E2A partners (Fig.
2),  can  bind  to  similar  DNA  sequences  in  vitro  but  act  on  different  tissue-specific  genes 
(Weintraub  et  al.,  1991).  Cooperative  or  inhibitory  relationships  with  other 
transcriptional regulators might contribute to this specificity (Lemercier et al., 1998;
Molkentin  and  Olson,  1996;  Postigo  and  Dean,  1999;  Weintraub  et  al.,  1994),  but  it  is  not 
likely  to  derive  entirely  from  other  lineage-specific  factors,  because  MyoD  can  induce 
myogenesis  in  many  different  cell  types  (Weintraub  et  al.,  1991).  Initiation  of 
myogenesis  by  MyoD  and  other  myogenic  bHLH  proteins  depends  upon  three 
residues that are located within the BR and the BR-HLH junction (A5, T6, and K15; Figs. 1
and 2). These "myogenic" residues are not essential for binding a muscle-specific site in
vitro or in vivo, suggesting that they are involved in other critical interactions (Brennan
et al., 1991; Davis et al., 1990; Davis and Weintraub, 1992; Schwarz et al., 1992;
Weintraub et al., 1991). These interactions have been proposed to involve distinct
co-factors (Brennan et al., 1991; Davis et al., 1990; Weintraub et al., 1991), and the
unmasking of an activation domain in MyoD or the myogenic co-factor MEF2 (Bengal
et al., 1994; Black et al., 1998; Huang et al., 1998; Weintraub et al., 1991). In the
MyoD-DNA structure, K15 is oriented away from the DNA, but A5 and T6 face the major
groove and could not contact other proteins directly (Ma et al., 1994) (Fig. 1). However,
the  latter  two  residues  could  influence  protein-protein  interactions  indirectly,  by 
affecting  how  the  BR  helix  is  positioned  on  the  DNA  (Ma  et  al.,  1994).  Although 
substitutions  at  these  positions  might  not  substantially  impair  binding  to  particular 
CANNTG  sites,  it  is  important  to  determine  whether  they  might  have  more  subtle 
influences  on  sequence  specificity  that  could  reflect  conformational  effects. 

We have determined that the myogenic residues A5 and T6 establish the
characteristic  MyoD  sequence  preference,  which  includes  a  CAGCTG  core.  Individual 
substitutions  at  these  BR  positions  simultaneously  alter  preferences  for  multiple  bases 
that  MyoD  does  not  contact  directly  (Ma  et  al.,  1994),  indicating  that  these  preferences 
are  determined  indirectly,  by  how  the  BR  helix  is  positioned  on  the  DNA.  This 
mechanism  is  distinct  from  the  standard  model  for  sequence  specificity,  in  which 
preferred  bases  are  contacted  directly  (Pabo  and  Sauer,  1992;  Steitz,  1990).  The 
corresponding  BR  residues  are  also  required  for  the  sequence  preferences  of  E2A 
proteins,  which  can  recognize  either  of  two  distinct  half  sites  depending  upon  their 
dimerization  partner.  E2A  homodimers  and  E2A  +  MyoD  heterodimers  bind  to 
asymmetric  sites  that  include  a  CAGCTG  core.  In  contrast,  as  a  heterodimer  with  the 
bHLH  protein  Twist,  E2A  binds  preferentially  to  half  of  the  symmetric  sequence 
CATATG.  The  preference  of  E2A  for  the  former  asymmetric  sites  depends  not  only 
upon  the  BR  sequence,  but  also  upon  BR  positioning  that  involves  the  junction  region. 
An analysis of DNA binding by MyoD and E2A junction and BR mutants indicates that a
MyoD-like  sequence  specificity  is  associated  with,  but  not  sufficient  for,  myogenesis. 
This  supports  the  model  that  the  BR-junction  region  is  also  involved  in  other  critical 
interactions.  The  results  suggest  that  E2A  and  its  partner  bHLH  proteins  bind  DNA  by 
adopting  a  limited  number  of  preferred  BR  conformations,  each  of  which  is  associated 
with a characteristic DNA sequence preference. They also predict that binding of
co-factors to the MyoD BR might be influenced by how it is positioned on the DNA, and
are  consistent  with  the  idea  that  relatively  subtle  differences  in  binding  sequence 
recognition  can  modulate  bHLH  protein  activity  in  vivo. 

C. TTP/TIS11 proteins and cell survival:

Complex  programs  of  gene  expression  ensue  when  quiescent  cells  are  stimulated 
by  growth  factors  to  enter  the  cell  cycle  (Iyer  et  al.,  1999).  The  "immediate-early"  genes 
are  induced  directly  by  many  stimuli,  and  during  some  apoptotic  events.  Their 
products  include  transcription  factors  associated  with  proliferation  and  apoptosis 
(Hafezi et al., 1997; Shi et al., 1992; Zhan et al., 1997). In addition, during these
responses  the  localization,  stability,  and  translation  of  specific  mRNAs  are  affected 
(Brown  and  Schreiber,  1996;  Chen  and  Shyu,  1995),  indicating  that  post-transcriptional 
regulatory  mechanisms  are  also  involved. 

The immediate early protein tristetraprolin (TTP; also known as TIS11, Nup475,
and  G0S24)  is  induced  transiently  in  various  cell  types  by  diverse  stimuli,  and  during 
regeneration  of  certain  tissues  (DuBois  et  al.,  1990;  Lai  et  al.,  1990;  Varnum  et  al.,  1989; 
Worthington et al., 1996). TTP is closely related to the TIS11b and TIS11d proteins,
particularly within tandem Cys-X8-Cys-X5-Cys-X3-His (Cys3His) zinc fingers (Varnum
et al., 1991). Each of these three proteins, which we refer to as the TTP/TIS11 proteins,
is  induced  rapidly  by  multiple  different  agents,  although  they  vary  with  respect  to  their 
baseline  mRNA  levels  and  induction  by  particular  stimuli  (Corps  and  Brown,  1995; 
Gomperts et al., 1992; Varnum et al., 1991). Other Cys3His zinc finger proteins are
involved  in  mRNA  binding,  cleavage,  or  processing,  or  have  been  implicated  in  post- 
transcriptional  gene  regulation  (Barabino  et  al.,  1997;  Batchelder  et  al.,  1999;  Guedes  and 
Priess,  1997;  Murray  et  al.,  1997;  Rudner  et  al.,  1998;  Tabara  et  al.,  1999;  Tronchere  et  al., 
1997), predicting that TTP/TIS11 proteins also perform mRNA-associated functions.

Although  TTP  is  induced  in  a  variety  of  contexts,  mice  in  which  the  TTP  gene  has 
been  disrupted  (TTP  -/-  mice)  have  a  limited  phenotype.  They  are  normal  at  birth,  but 
later  develop  a  systemic  inflammatory,  arthritic,  and  myeloproliferative  syndrome 
which is mediated by the cytokine tumor necrosis factor-alpha (TNF-α), and derives
from an abnormality in non-lymphoid hematopoietic cells (Carballo et al., 1997;
Carballo et al., 1998; Taylor et al., 1996). When stimulated in vitro, TTP -/- macrophages
produce moderately elevated levels of TNF-α protein and mRNA, the half-life of which
is prolonged. TNF-α and many cytokine and growth factor-induced mRNAs are
regulated post-transcriptionally through AU-rich elements (AREs) in their 3'
untranslated regions (Chen and Shyu, 1995; Ross, 1995; Shaw and Kamen, 1986). When
TTP is over-expressed, TNF-α and other cytokine mRNAs are deadenylated and
destabilized, and binding of TTP to the TNF-α ARE can be detected readily (Carballo et
al., 1998; Lai et al., 1999). These observations, and the finding that TTP is induced by
TNF-α, have suggested the model that TTP normally destabilizes the TNF-α mRNA
directly, through a feedback mechanism (Carballo et al., 1998; Lai et al., 1999). It is
intriguing, however, that TNF-α mRNA levels are also decreased by low-level TTP
expression but increased by intermediate TTP amounts (Lai et al., 1999), suggesting that
TTP may have complex and apparently indirect effects on TNF-α expression.

Although TTP/TIS11 proteins are evolutionarily conserved and induced by
numerous extracellular stimuli, suggesting a broader role, no other functions of
metazoan TTP/TIS11 proteins have been described. TTP is induced during apoptosis,
however, in response to the breast cancer susceptibility protein BRCA1 (Harkin et al.,
1999), and withdrawal of growth factors from neuronal cells (Mesner et al., 1995). In S.
pombe, a related protein is required for effective transmission of a pheromone-induced
ras/mitogen-activated protein kinase signal (Kanoh et al., 1995). In addition, a nim1
cdc25 mutant can be complemented by either the cdc2 kinase or a TTP/TIS11 gene,
suggesting a cell cycle effect (Warbrick and Glover, 1994). A TTP/TIS11-related protein
in S. cerevisiae is required for normal metabolism, and retards cell growth when
overexpressed (Ma and Herschman, 1995; Thompson et al., 1996). These observations
indicate that TTP/TIS11 proteins might influence regulatory pathways that control
survival, differentiation, or proliferation.

If TTP acts on such pathways when it is induced transiently, its continuous
expression might be predicted to affect cell growth or viability. Supporting this idea,
we report that continuous TTP expression causes various cell types to undergo
apoptotic cell death. This response occurs at TTP expression levels which are
comparable to those attained transiently during serum stimulation. Each TTP/TIS11
protein stimulates apoptosis with similar frequency and timing, and by various criteria
this cell death appears analogous to apoptosis induced by oncoproteins such as E2F-1,
or the immediate early protein c-Myc. In addition, TTP differs from TIS11b and TIS11d
in that, like both E2F-1 and c-Myc, it appears to sensitize cells to induction of apoptosis
by TNF-α. The data indicate that TTP/TIS11 proteins generally act similarly on growth
or survival pathways, but also that TTP may have a distinct influence on responses to
TNF-α. They suggest that the role of TTP in TNF-α regulation might be complex, and
imply that in TTP -/- mice some TTP functions may be compensated for by other
mechanisms.


BODY: 


A.  Determinants  of  bHLH  protein  DNA  binding  specificity: 

Myogenic  BR  residues  and  MyoD  DNA  binding  preferences.  Identification  of  the 
myogenic  BR  residues  stemmed  originally  from  studies  in  which  the  MyoD  BR  was 
replaced  with  that  of  E12,  a  product  of  the  alternatively-spliced  E2A  gene  (Murre  et  al., 
1989).  This  MyoD  mutant  (MD(E12B);  Fig.  2)  binds  to  a  muscle-specific  regulatory  site 
as  a  heterodimer  with  E2A  proteins  either  in  vitro  or  in  vivo,  but  cannot  induce 
myogenesis  in  a  cell  culture  assay  or  activate  transcription  through  a  muscle-specific 
enhancer  (Davis  et  al.,  1990;  Weintraub  et  al.,  1991).  Re-substitution  of  the  "myogenic" 
residues A5 and T6 (Fig. 2) into MD(E12B) restores its activity in these functional assays
(Weintraub et al., 1991). Similar results are obtained when A5 and T6 are mutated
within MyoD (Davis and Weintraub, 1992; Huang et al., 1998; Weintraub et al., 1991),
and when analogous substitutions are made in the context of the myogenic bHLH
protein Myogenin (Brennan et al., 1991). These experiments implicate A5 and T6 in
mechanisms  that  are  of  functional  importance,  but  not  essential  for  binding  to  a 
particular  muscle-specific  DNA  sequence. 

We  employed  an  in  vitro  selection  strategy  (Blackwell  and  Weintraub,  1990)  to 
test  whether  such  mutations  might  have  more  subtle  effects  on  how  MyoD  binds 
specifically  to  DNA.  To  identify  sequences  to  which  these  mutants  bind  preferentially, 
we  used  sequence  libraries  in  which  only  positions  within  and  flanking  the  CANNTG 
consensus  are  randomized  (Fig.  3A),  so  that  the  position  of  bHLH  protein  binding 
along  the  DNA  is  fixed.  This  strategy  makes  it  possible  to  sequence  the  selected  sites  as 
a  pool,  and  thereby  to  analyze  a  very  large  population  of  selected  sites  simultaneously 
(Blackwell  et  al.,  1990;  Blackwell  and  Weintraub,  1990).  It  reveals  the  relative 
preferences  for  individual  bases  at  each  site  position,  and  can  detect  subtle  differences 
that  might  not  be  identified  through  more  conventional  approaches. 
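
As a rough illustration of the per-position base preferences that this assay reports, the sketch below tallies base frequencies across a set of aligned sites. It is not the pooled sequencing procedure itself, which reads the selected population directly. The function name position_frequencies and the four example sites are hypothetical and are not data from this report.

```python
from collections import Counter

BASES = "ACGT"


def position_frequencies(sites):
    """Return one {base: fraction} dict per position of equal-length aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    table = []
    for i in range(length):
        counts = Counter(site[i].upper() for site in sites)
        table.append({base: counts.get(base, 0) / len(sites) for base in BASES})
    return table


# Hypothetical aligned sites, for illustration only.
selected = ["AACAGCTGTT", "GACAGCTGTC", "AACAGCTGTT", "GACACCTGTT"]

for i, freqs in enumerate(position_frequencies(selected)):
    print(i, {base: round(f, 2) for base, f in freqs.items()})
```

Because the randomized positions are fixed relative to the CANNTG consensus, each column of such a table refers to the same position in every selected site, which is what allows the preferences to be read position by position.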

This  assay  has  previously  shown  that  the  preferred  MyoD  binding  consensus  is 
G/AACAGCTGTT/C (Figs. 3B and 3C), and that the E2A proteins E12 and E47 overlap
considerably  with  MyoD  in  their  binding  properties,  but  prefer  sites  that  have  an 
asymmetric  CACCTG  core  sequence  (Fig.  3C)  (Blackwell  and  Weintraub,  1990). 
However, in contrast to either of these proteins, the MD(E12B) mutant prefers the
sequence G/ACCATATGGT/C, which differs from the MyoD preferred site over the 8
central  base  pairs,  and  contains  the  distinct  core  sequence  CATATG  (Figs.  3B  and  3C). 
This  sequence  and  related  elements  are  normally  targeted  by  the  bHLH  protein  Twist, 
an  E  protein  partner  that  is  involved  in  mesodermal  cell  fate  specification  (Cripps  et  al., 
1998;  Harfe  et  al.,  1998;  Michelson,  1996;  Szymanski  and  Levine,  1995;  Yin  et  al.,  1997) 
(Fig. 2). Back-substitution of A5 of MyoD into MD(E12B), which is not sufficient for
myogenic activity in cell culture assays (Weintraub et al., 1991), results in preferences
that are slightly more similar to those of MyoD at positions ±4 (MD(E12B-A); Figs. 2,
3B, and 3C). However, introduction of both A5 and T6, which restores myogenesis
(Brennan et al., 1991; Weintraub et al., 1991), results in preferences across the entire site
that are indistinguishable from those of MyoD (MD(E12B-AT); Figs. 2, 3B, and 3C).
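
The difference between the two preferred sites can be made explicit with a small position-by-position comparison. This sketch reads the MyoD site as (G/A)ACAGCTGT(T/C) and the MD(E12B) site as (G/A)CCATATGG(T/C), writes the two-base alternatives with the IUPAC codes R and Y, and assumes that positions are numbered -5 to -1 and +1 to +5 outward from the center of the site. The names used here are hypothetical.

```python
# IUPAC codes needed for the two sites (R = A or G, Y = C or T).
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"},
}

MYOD_SITE = "RACAGCTGTY"     # (G/A)ACAGCTGT(T/C)
MD_E12B_SITE = "RCCATATGGY"  # (G/A)CCATATGG(T/C)


def site_positions(length):
    """Number positions -n..-1, +1..+n around the center (assumed convention)."""
    half = length // 2
    return [p for p in range(-half, half + 1) if p != 0]


def differing_positions(site_a, site_b):
    """Return the positions at which two equal, even-length sites differ."""
    if len(site_a) != len(site_b) or len(site_a) % 2:
        raise ValueError("sites must have the same even length")
    positions = site_positions(len(site_a))
    return [
        pos
        for pos, a, b in zip(positions, site_a, site_b)
        if IUPAC[a] != IUPAC[b]
    ]


print(differing_positions(MYOD_SITE, MD_E12B_SITE))
```

Under these assumptions the two sites differ at ±4 and ±1, all of which fall within the central 8 base pairs, while the ±5 positions are shared.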

To  determine  whether  these  sequence  preferences  reflect  significant  differences 
in  binding  affinity  and  specificity,  we  compared  binding  of  these  proteins  to  individual 
oligonucleotides  that  correspond  to  the  MyoD  and  Twist  preferences,  and  differ  only  at 
positions  within  and  adjacent  to  the  CANNTG  consensus  (Fig.  3D).  Supporting  the  in 
vitro selection findings, both MyoD and MD(E12B-AT) homodimers bound with
higher affinity to the preferred MyoD site than to the Twist site (Fig. 3D, lanes 1, 4, 5,
and 8). In contrast, the Twist site was preferred by MD(E12B) and, to a lesser extent,
MD(E12B-A) (Fig. 3D, lanes 2, 3, 6, and 7). In a binding competition assay, specific
DNA binding by MD(E12B-AT) was competed much more effectively by the MyoD site
(Fig.  4A,  lanes  4, 7, 10, 13,  and  16),  and  binding  by  either  MD(E12B)  or  MD(E12B-A)  was 
competed  better  by  the  Twist  site  (Fig.  4B,  lanes  2, 3, 8, 9, 14,  and  15).  A  c-Myc 
preferred  site  (CACGTG;  not  shown)  was  a  relatively  poor  competitor  of  binding  by 
each  of  these  proteins  (Figs.  4A  and  B,  lanes  17-19).  The  data  show  that  introduction  of 
A5 and T6 into MD(E12B) restores not only myogenic activity (Fig. 2) but also the MyoD
DNA  binding  preference.  This  substitution  affects  sequence  recognition  across  4  bp 
within  each  half-site  (Figs.  3A  and  B),  indicating  a  global  effect  on  how  the  BR  helix  is 
positioned  on  the  DNA.  The  finding  that  MD(E12B)  is  distinct  from  either  MyoD  or  E12 
in  its  binding  sequence  preference  also  indicates  that  DNA  recognition  by  an  E2A  BR 
can  be  profoundly  influenced  by  its  molecular  context. 

Influence  of  BR  positioning  on  MyoD/E2A  and  Twist/E2A  heterodimer  sequence 
preferences.  Twist  and  E2A  proteins  appear  to  cooperate  in  vivo  to  regulate 
transcription  through  CATATG  sites  (Harfe  et  al.,  1998),  suggesting  that  the  DNA 
sequence  recognition  properties  of  E2A  might  be  altered  by  heterodimerization  with 
Twist.  However,  an  alternative  possibility  is  that  functional  Twist  +  E2A  recognition 
sites  are  distinct  from  their  in  vitro  binding  preference  (Huang  et  al.,  1996).  To  address 
this  question,  we  performed  in  vitro  selection  on  Twist  +  E12  complexes.  Twist 
homodimers  and  Twist  +  E12  heterodimers  both  preferred  sites  that  contain  the  core 
sequence  CATATG  (Figs.  5A  and  B).  They  were  similar  to  MD(E12B)  and  especially  to 
MD(E12B-A) in their preferences at ±4, but selected MyoD-like sequences at ±5 (Figs. 3B
and C, 5A and B). The symmetry of this preferred sequence suggests that in the Twist +
E12  protein-DNA  complex,  the  Twist  and  E12  BRs  each  prefer  the  same  half-site 
sequence.  In  contrast,  and  as  observed  previously  (Blackwell  and  Weintraub,  1990), 
MyoD+E12  heterodimers  selected  a  MyoD-like  half  site  at  positions  +4  and  +5,  an  E2A- 
like half-site at -4 and -5, and CC or GG bases in the center of the site (Figs. 5A and B),
indicating  asymmetric  binding.  Apparently,  an  E2A  BR  normally  prefers  distinct  half- 
sites in the context of these two bHLH dimerization partners, indicating an
intermolecular  effect  on  how  it  interacts  specifically  with  DNA. 
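
The distinction between symmetric and asymmetric sites can be checked mechanically: a site is symmetric when it equals its own reverse complement, so that both half-sites read the same toward the center. The sketch below is illustrative only, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_symmetric(site):
    """True when the site is identical to its reverse complement."""
    return site.upper() == reverse_complement(site)


def half_sites(site):
    """Split an even-length site into two half-sites, each read toward the center."""
    if len(site) % 2:
        raise ValueError("site length must be even")
    mid = len(site) // 2
    left = site[:mid].upper()
    right = reverse_complement(site[mid:])
    return left, right


for core in ("CATATG", "CAGCTG", "CACCTG"):
    print(core, is_symmetric(core), half_sites(core))
```

CATATG and CAGCTG each consist of two identical half-sites, whereas CACCTG combines a CAC half-site with a CAG half-site, which is the sense in which binding to it is asymmetric.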

To  investigate  how  heterodimer  formation  influences  the  binding  preferences  of 
the  E12  and  MyoD  BRs,  we  performed  in  vitro  selection  on  combinations  of  MyoD  and 
E12  BR  mutants.  When  the  BR  of  one  partner  within  a  MyoD  +  E12  heterodimer  was 
substituted  with  that  of  the  other,  the  heterodimer  binding  preferences  outside  the 
CANNTG  consensus  corresponded  to  those  of  the  individual  BRs.  For  example,  unlike 
MD(E12B)  homodimers  (Figs.  3B  and  C),  heterodimers  of  MD(E12B)+E12  preferred 
wild-type  heterodimer  sequences  in  the  center  of  the  site,  and  selected  E2A-like 
sequences in both flanking regions, at ±4 and ±5 (Figs. 5A and B). A heterodimer of
MyoD  and  an  E12  protein  containing  the  MyoD  BR  (E12(MDB);  Fig.  2A)  similarly 
selected  a  wild-type  heterodimer  preference  within  the  CANNTG  motif,  but  preferred  a 
MyoD-like sequence at ±4 and ±5 (Figs. 5A and B). In contrast, MD(E12B) + E12(MDB)
heterodimers had a binding preference more similar to Twist (Figs. 5A and B),
indicating  that  placement  of  each  BR  in  the  protein  context  of  the  other  partner  affected 
binding  over  the  entire  site.  A  striking  aspect  of  our  findings  is  that  each  of  the  mutant 
homo-  or  heterodimer  protein  complexes  that  we  have  examined  selected  sequences 
that  correspond  to  particular  patterns  preferred  by  MyoD,  E2A,  or  Twist  proteins  (Figs. 
3C  and  5B). 

These  in  vitro  selection  findings  were  supported  by  assays  of  binding  to 
individual  sites,  including  a  sequence  from  a  muscle-specific  regulatory  region  (MCK- 
R).  This  site  corresponds  to  the  MyoD  +  E12  heterodimer  in  vitro  binding  preference 
and  responds  to  MyoD  in  vivo,  and  was  used  in  the  original  analysis  of  the  myogenic 
residues  (Blackwell  and  Weintraub,  1990;  Davis  et  al.,  1990;  Weintraub  et  al.,  1991).  In 
an  EMSA,  MyoD  +  E12  heterodimers  bound  with  higher  affinity  to  either  the  MCK-R  or 
MyoD sites than to the Twist site (Fig. 5C, lanes 3, 12, and 21). MD(E12B) + E12
heterodimers  only  slightly  preferred  the  MCK-R  heterodimer  site  to  the  Twist  site,  but 
appeared  to  prefer  either  of  these  sequences  to  the  MyoD  site  (Fig.  5C,  lanes  5, 14,  and 
23).  As  the  preferences  of  MD(E12B-A)  and  MD(E12B-AT)  homodimers  would  predict, 
introduction of both A5 and T6 into MD(E12B) altered its sequence preferences as a
heterodimer  with  E12,  so  that  they  were  more  similar  to  those  of  MyoD  (not  shown). 
MyoD  +  E12(MDB)  heterodimers  only  modestly  preferred  the  MyoD  or  MCK-R  sites  in 
comparison  to  the  Twist  site  (Fig.  5C,  lanes  4, 13,  and  22).  In  contrast,  the  Twist  site  was 
preferred  by  MD(E12B)  +  E12(MDB),  Twist,  and  Twist  +  E12  complexes  (Fig.  5C,  lanes  6, 
8, 9, 15, 17, 18, 24, 26,  and  27). 

Binding  site  competition  and  protein  titration  assays  also  supported  the  in  vitro 
selection  data.  The  MyoD  site  competed  more  effectively  than  the  Twist  site  for  binding 
by  either  MyoD  homodimers  or  MyoD  +  E12  heterodimers  (Figs.  6A  and  6B,  lanes  1, 4, 
7, 10, 13,  and  16).  In  contrast,  the  Twist  site  competed  more  effectively  for  binding  by 
MD(E12B),  MD(E12B)  +  E12,  Twist,  and  Twist  +  E12  complexes,  although  these  latter 
complexes  appeared  to  bind  with  less  specificity  than  did  MyoD  +  E12  complexes  (Figs. 
6C  and  6D,  lanes  2, 3, 5, 6, 8, 9, 11, 12, 14, 15, 17,  and  18).  However,  the  distinct  binding 
specificities of MyoD + E12 and Twist + E12 heterodimers were apparent in a protein
titration assay, in which the amount of MyoD or Twist protein was varied under
conditions of low DNA concentration (Figs. 7A and B, lanes 1-6 and 13-18) that more
closely represent differences in binding affinity (Carey, 1991). Also in agreement with
results described above (Fig. 5C, lanes 14 and 23), heterodimers of MD(E12B) + E12 bind
to the MCK-R site with decreased specificity, and with slightly lower affinity than MyoD
+ E12 complexes (Figs. 7A and B, lanes 7-12).

To investigate the role of the BR-HLH junction region in BR positioning, we
examined the DNA binding preferences of the MD(E12BJ) and E12(MDBJ) mutants, each
of which contains both the BR and junction of the other partner (Fig. 2). In contrast to
MD(E12B)  +  E12(MDB)  heterodimers  (Figs.  5A  and  B;  Fig.  5C,  lanes  6, 15,  and  24), 
MD(E12BJ)  +  E12(MDBJ)  heterodimers  (Fig.  2A)  bound  to  the  MyoD,  Twist,  and  MCK-R 
sites  with  relative  preferences  that  are  comparable  to  those  of  MD  +  E12  heterodimers 
(Fig.  5C,  lanes  3, 7, 12, 16, 21,  and  25).  Apparently,  the  Twist-like  sequence  preference 
resulting  from  simultaneous  mis-pairing  of  both  the  MyoD  and  E12  BRs  (Fig.  5A  and  B) 
can  be  "corrected"  by  matching  each  of  these  BRs  with  the  corresponding  junction 
region.  Similarly,  and  in  contrast  to  MD(E12B)  homodimers,  MD(E12BJ)  homodimers 
bind  to  the  MyoD,  Twist,  and  MCK-R  sites  with  preferences  that  are  similar  to  E2A 
proteins  (Fig.  8B  and  C,  lane  20;  not  shown).  These  findings  indicate  that  the  BR-HLH 
junction  can  be  critical  for  establishing  the  sequence  specificity  of  an  E2A  BR, 
presumably  because  it  influences  how  the  BR  is  positioned  on  the  DNA. 

Contributions  of  the  BR  and  junction  to  binding  affinity  and  specificity.  It  has  been 
shown previously that introduction of A5, T6, and either the junction region or K15 of
MyoD confers upon E12 the capacity to induce myogenesis (Fig. 2) (Davis and
Weintraub, 1992). In the MyoD-DNA complex, A5 and T6 are not positioned to allow
direct  protein-protein  contact  (Fig.  1)  (Ma  et  al.,  1994),  but  we  have  shown  that  they  are 
critical  for  the  DNA  sequence  preferences  of  MyoD,  apparently  because  they  affect  the 
conformation  of  the  BR-DNA  complex.  We  have  also  determined  that  the  junction 
region  can  influence  how  the  E2A  BR  binds  DNA.  These  observations  suggest  the 
possibility  that  the  capacity  for  myogenesis  might  derive  entirely  from  the 
conformation  of  the  DNA-bound  MyoD  BR,  a  model  which  would  predict  that  the 
respective  sequence  preferences  of  each  of  these  bHLH  proteins  might  be  established 
by  amino  acids  at  BR  positions  5, 6,  and  15.  We  have  investigated  this  model  by 
determining  how  individual  substitutions  at  these  positions,  which  have  been  shown  to 
be  critical  in  vivo,  influence  the  DNA  binding  preferences  of  MyoD. 

To address the importance of the MyoD junction region for DNA binding, we
substituted MyoD positions 14 and 15 (Fig. 8A), and left position 13 intact because it is
not required for the MyoD sequence preference in the MD(E12B-AT) mutant (Figs. 2
and 3C). Substitution of alanine for S14, which does not interact with DNA (Ma et al.,
1994), increased binding affinity (MD(AK), Fig. 8A; Figs. 8B and C, lanes 4 and 5),
perhaps  by  stabilizing  the  BR  helix.  The  preference  of  MD(AK)  for  the  MyoD  site  was 
not  substantially  altered  by  replacement  of  position  15  with  alanine  (MD(AA)),  or  with 
either aspartic acid (MD(AD)) or serine (MD(AS) and MD(QS)), which respectively
correspond to residues from E12 and Twist (Fig. 8A; Figs. 8B and C, lanes 5-9). The
relative  preferences  of  these  mutants  for  the  MyoD  site  are  comparable  to  the  binding 
preferences  of  other  proteins  that  were  confirmed  by  binding  competition  analysis 
(Figs.  4  and  6).  Apparently,  appropriately  specific  DNA  binding  by  MyoD  homodimers 
is not impaired by a variety of BR-HLH junction substitutions, including
non-conservative mutations of K15. This flexibility contrasts with the importance of the
junction region for positioning the E12 BR, and with the requirement for K15 for
myogenesis.

To  investigate  the  role  of  BR  positions  5  and  6  in  a  neutral  context,  we  first 
substituted  alanine  for  two  non-conserved  BR  residues  (MD-AAATA,  Fig.  8A)  that  are 
not  predicted  to  be  required  for  DNA  binding  (Fisher  et  al.,  1993;  Ma  et  al.,  1994).  This 
substitution proportionally increased binding to both sites in the context of MyoD
(MD-AAATA; Figs. 8B and C, lane 10), and enhanced specificity for the MyoD site in the
context of MD(AA) (Fig. 8A; Figs. 8B and C, lane 12). Replacement of T6 with
asparagine conferred a preference for the Twist site (MD-AAANA, Fig. 8A; Figs. 8B and
C, lanes 10 and 13), a finding that parallels the preferences of MD(E12B-AT) and
MD(E12B-A) (Figs. 3B and C). This effect was not diminished by various BR-HLH
junction mutations or enhanced by presence of Twist junction residues (Figs. 8B and C,
lanes 13-17), indicating that N6 is the most important of these residues for the Twist
sequence  preference.  To  test  whether  E2A  amino  acids  that  correspond  to  the  three 
myogenic  residues  could  specify  an  E2A-like  DNA  binding  preference,  we  introduced 
an asparagine at BR position 5 into MD-AAANA and MD-AAANA(AD), the latter of
which  contains  the  D15  residue  characteristic  of  E2A  proteins  (Fig.  8A).  In  contrast  to 
MD(E12BJ),  these  mutants  strongly  preferred  the  Twist  site  to  the  MyoD  or  MCK-R 
sites  (Figs.  8B  and  C,  lanes  18-20;  not  shown),  indicating  that  establishment  of  an  E2A 
homodimer  sequence  preference  requires  additional  E2A  BR  or  junction  residues,  and 
that  the  conformational  mechanisms  that  dictate  this  asymmetric  sequence  preference 
might  be  complex. 

In  the  examples  we  have  analyzed,  MyoD  mutants  that  lack  myogenic  activity 
bind  preferentially  to  the  Twist  site  (Figs.  2  and  3C),  raising  the  question  of  whether 
changes  in  DNA  binding  preferences  accompany  conversion  of  E12  into  a  "myogenic" 
protein  through  introduction  of  MyoD  BR  and  junction  residues.  E12  homodimers  do 
not  bind  DNA  as  well  as  the  E2A  protein  E47  (Fig.  9,  lanes  1, 2, 8, 9, 15,  and  16),  which 
also  cannot  induce  myogenesis  (Davis  and  Weintraub,  1992).  Introduction  of  the  MyoD 
BR  into  E12  is  not  sufficient  for  myogenesis  (E12(MDB),  Fig.  2),  but  sharply  increased 
binding  of  E12  to  all  three  sites  and  was  associated  with  a  modest  preference  for  the 
MyoD  site  (Fig.  9,  lanes  3, 10,  and  17).  The  E12(MDBJ)  mutant,  which  can  induce 
myogenesis  (Fig.  2),  bound  to  each  of  the  three  sites  at  a  lower  level  than  E12(MDB)  and 
did  not  have  a  markedly  increased  preference  for  either  the  MyoD  or  MCK-R  sites  (Fig. 
9,  lanes  4, 11,  and  18).  Heterodimerization  with  E47  increased  the  relative  levels  with 
which  E12(MDBJ)  bound  to  the  MyoD  and  MCK-R  sites  (Fig.  9,  lanes  6, 7, 13, 14, 20,  and 
21), but also did not identify DNA binding effects that appear to be sufficient to account
for  the  different  functional  properties  of  E12(MDB)  and  E12(MDBJ).  These  findings 
further  support  the  idea  that  the  MyoD  junction  region  is  not  critical  for  DNA  binding 
(Figs.  8B  and  C,  lanes  4-9),  and  instead  is  important  for  myogenesis  because  it  is 
involved  in  other  interactions  (Davis  and  Weintraub,  1992). 

B. TTP/TIS11 proteins and cell survival:

Programmed  cell  death  in  response  to  TTP. 

To  test  whether  continuous  TTP  expression  might  impair  cell  viability,  we 
introduced TTP into 3T3 cells by transient transfection. Within two days, many TTP-transfected
cells appeared apoptotic (Fig. 10A), were positive in a TUNEL assay (Fig.
10B), and contained pyknotic nuclei (Fig. 11B). This cell death increased between 24 and
48 hours after transfection (Fig. 10C), indicating a relatively slow onset.

Within  a  transfected  population,  the  frequency  of  cell  death  was  generally 
proportional  to  the  amount  of  expression  construct  introduced,  but  in  individual  cells 
only  modest  levels  were  required  (Fig.  11).  Introduction  of  either  50  or  200  ng  of  TTP 
expression vector triggered apoptosis in a significant fraction of transfected cells (Fig.
11C). In these experiments, transfection efficiencies ranged between 25% and 50% (not
shown), but TTP protein could not be detected by western blotting (Fig. 11A, lanes 7
and 8). In contrast, endogenous TTP was readily detectable after serum stimulation
(Fig. 11A, lanes 2 and 3), although by immunofluorescence it was barely apparent
above  background  (not  shown).  After  introduction  of  50  ng  TTP  expression  vector, 
most  dead  cells  (72%)  did  not  express  TTP  at  levels  that  were  detectable  by 
immunofluorescence (Fig. 11C). When 200 ng of expression vector was introduced, a
moderately  higher  proportion  of  apoptotic  than  non-apoptotic  cells  expressed  TTP  at 
just  visible  levels  (41%  versus  25%),  but  TTP  was  still  undetectable  in  many  apoptotic 
cells (38%) (Figs. 11B and 11C). In each of these transfections, the proportion of cells
expressing  TTP  at  high  levels  did  not  appear  to  be  elevated  in  the  apoptotic  fraction, 
perhaps  because  some  of  these  apoptotic  cells  had  become  detached  from  the  plate  (not 
shown).  Considering  the  high  transfection  efficiency,  the  observation  that  a  transfected 
population expressed less TTP than serum-induced cells (Fig. 11A, lanes 2, 3, 7, and 8)
suggests  that  many  apoptotic  transfected  cells  expressed  TTP  at  levels  that  were 
comparable  to  or  lower  than  those  resulting  from  serum  stimulation.  This  indicates 
that  excessively  high  levels  of  TTP  are  not  required  for  its  induction  of  cell  death. 

Various  cell  types  undergo  apoptosis  in  response  to  TTP,  including  primary  cells 
(Fig. 10C), demonstrating that immortalization is not a prerequisite. In U2OS and
SAOS2 cells, death was delayed (Fig. 10C) and peaked after 72 hours, perhaps because
of  their  slower  growth  rates  (not  shown).  The  lack  of  functional  p53  in  SAOS2  cells 
(Chandar  et  al.,  1992)  also  may  have  decreased  but  did  not  prevent  their  apoptotic 
response,  indicating  that  p53  is  not  required.  The  frequency  of  apoptosis  was  relatively 
low in 293 cells, particularly at 48 hours after transfection (Fig. 10C), and was not
increased  by  additional  TTP  expression  vector  (not  shown).  The  decreased  apoptotic 
response  in  this  cell  line,  in  which  the  effects  of  TTP  on  the  TNF-a  mRNA  have  been 
studied (Carballo et al., 1998; Lai et al., 1999), is likely to derive from its expression of
adenovirus E1B 19K, which mimics the anti-apoptotic protein Bcl-2 (Han et al., 1998).

The  C.  elegans  POS-1  and  PIE-1  proteins  each  contain  two  related  zinc  fingers  (Mello  et 
al.,  1996;  Tabara  et  al.,  1999)  but  did  not  cause  significant  apoptosis  in  this  assay  (not 
shown), indicating that not all Cys3His zinc finger proteins trigger cell death when
constitutively expressed. Cell death in response to TTP was decreased by mutation of
the first Cys3His zinc finger, and abrogated by alteration of both (Fig. 12), supporting
the  idea  that  it  is  caused  by  TTP  acting  on  appropriate  targets. 

Similarities between TTP/TIS11- and oncogene-induced apoptosis.

When expressed constitutively, TIS11b and TIS11d also triggered apoptosis
associated with TUNEL-positive nuclei (not shown). Each TTP/TIS11 protein induced
cell death with similar frequency and timing over a range of expression vector amounts
(Figs. 13A and B), and also as fusions with green fluorescent protein (GFP; not shown).
We have not developed assays for detecting endogenous TIS11b and TIS11d proteins,
nor could we detect the corresponding TTP/TIS11-GFP fusion proteins after
introduction of the modest DNA amounts used in Figures 13A and B. When these GFP
fusion proteins were overexpressed as in Figures 11A (lane 9) or 12B, however, western
blotting with a GFP antibody revealed that they were present at similar levels (not
shown). This suggests that the three TTP/TIS11 proteins are also similarly expressed
when  introduced  at  low  levels  in  cell  death  assays,  and  that  they  trigger  apoptosis 
comparably. 

The hypothesis that TTP/TIS11 proteins affect cell growth or survival signals
suggests  that  they  might  stimulate  apoptosis  analogously  to  some  oncoproteins.  The 
immediate early protein c-Myc is involved in G1 entry, and possibly in physiological
apoptotic events (Shi et al., 1992; Zhan et al., 1997). Its forced expression under low
serum conditions triggers apoptosis over a similar time course as TTP/TIS11 proteins,
apparently by stimulating growth or apoptotic pathways in the absence of survival
signals (Juin et al., 1999). This apoptosis is enhanced when c-Myc is overexpressed, but
can  be  detected  when  it  is  expressed  constitutively  at  the  levels  that  are  observed 
following  serum  induction  (Evan  et  al.,  1992).  It  is  inhibited  when  survival  signals  are 
restored  by  treatment  with  insulin-like  growth  factor  1  (IGF-1),  it  requires  the 
mitochondrial  death  machinery,  and  it  can  involve  p53  and  cell-surface  death  receptors 
(Juin  et  al.,  1999).  In  apparent  contrast,  the  S-phase  transcription  factor  E2F-1  induces 
apoptosis  in  the  presence  of  serum,  and  can  do  so  independently  of  p53  when  it  is 
expressed  at  sufficient  levels  (Hsieh  et  al.,  1997;  Kowalik  et  al.,  1995;  Phillips  et  al.,  1997). 

Cell death in response to TTP/TIS11 proteins appears to be analogous to these
apoptotic events in various respects. In each case, it was prevented by co-expression of
Bcl-2, which inhibits the mitochondrial death machinery but not direct caspase
activation by death receptors (Fig. 13C) (Gross et al., 1999). It was also partially
alleviated  by  the  CrmA  protein  (Fig.  13C),  an  effective  inhibitor  of  death  receptor- 
activated  caspases  (Ashkenazi  and  Dixit,  1998).  The  latter  finding  suggests  that 
TTP/TIS11-stimulated apoptosis might involve death receptors, but it is also possible
that  its  inhibition  by  CrmA  derives  from  effects  on  caspases  activated  by  the 
mitochondrial  machinery  (Gross  et  al.,  1999).  Serum  withdrawal  is  not  a  precondition 
for TTP/TIS11-stimulated apoptosis, but it markedly increased the frequency of death
(Fig.  13D).  This  increase  was  not  abrogated  by  nutrient  replacement,  and  was  offset  by 
IGF-1  treatment  (Fig.  13D),  supporting  the  model  that  it  involves  a  lack  of  survival 
signals. Apparently, constant TTP/TIS11 protein expression can overcome or
circumvent  the  survival  signals  provided  by  serum,  but  also  stimulates  apoptosis  more 
rapidly  when  these  signals  are  lacking. 


Synergistic  induction  of  apoptosis  by  TTP  and  TNF-a. 

TNF-a  stimulates  apoptosis  by  binding  to  its  Type  I  receptor,  an  event  which 
triggers  caspases  directly  (Ashkenazi  and  Dixit,  1998).  Simultaneously,  however,  this 
binding  can  activate  anti-apoptotic  genes,  which  apparently  must  remain  silent  if  cell 
death  is  to  occur  (Grumont  et  al.,  1999;  Wang  et  al.,  1998;  Wu  et  al.,  1998;  Zong  et  al., 
1999).  TNF-a-induced  apoptosis  is  enhanced  by  expression  of  the  oncoproteins  c-Myc, 
adenovirus E1A, and E2F-1, and by lack of the tumor suppressor Rb, and it is impaired
by  inhibition  of  either  c-Myc  or  cell  cycle  progression  (Janicke  et  al.,  1994;  Klefstrom  et 
al.,  1994;  Meikrantz  and  Schlegel,  1996;  Phillips  et  al.,  1999).  These  findings  indicate  that 
pathways  which  regulate  growth  or  proliferation  can  influence  how  a  cell  responds  to 
TNF-a. 

TNF-a  treatment  induces  TTP  mRNA  expression  (Carballo  et  al.,  1998), 
suggesting the possibility that TTP/TIS11 proteins might also affect responses to
TNF-a. To test this idea, we added TNF-a to 3T3 cells that were transfected with
TTP/TIS11 expression vectors in amounts that triggered only modest cell death (Fig.
14). Administration of TNF-a shortly after transfection dramatically increased
apoptosis in cells that expressed TTP but, surprisingly, not TIS11b or TIS11d (Fig. 14A).
The same trends were observed when TNF-a was added after TTP had been expressed
for  24  hours  after  transfection  (Fig.  14B).  In  these  TTP-expressing  cells,  death  was  also 
increased by incubation with TNF-a for only four hours in the presence of
cycloheximide (Fig. 14C), which blocks its induction of anti-apoptotic proteins. The
rapidity  of  this  last  effect  is  more  consistent  with  apoptosis  induced  by  death  receptors 
(Ashkenazi  and  Dixit,  1998)  than  with  the  slower  time  course  of  TTP-stimulated  cell 
death (Fig. 10C), suggesting that TTP has sensitized these cells to the apoptotic stimulus
of  TNF-a.  Similar  results  were  obtained  in  parallel  experiments  performed  in  HeLa 
cells  (not  shown).  Supporting  the  idea  that  this  effect  is  specific  to  TTP,  expression  of 
TIS11b and TIS11d at higher levels increased the overall level of cell death, but did not
promote  induction  of  additional  apoptosis  by  TNF-a  (not  shown). 

KEY  RESEARCH  ACCOMPLISHMENTS 


-- Demonstration that the conformation with which bHLH proteins bind DNA is of
critical  importance,  and  that  binding  affinity  per  se  is  not  necessarily  sufficient  for  their 
activity.  These  observations  may  have  important  implications  for  understanding 
functions  of  c-Myc,  which  have  been  implicated  in  breast  cancer. 

--  The  immediate  early  protein  TTP,  which  is  induced  by  EGF  and  various  mitogens, 
causes  apoptosis  when  it  is  expressed  constitutively  at  levels  to  which  it  is  normally 
induced  transiently.  This  apoptosis  appears  analogous  to  that  induced  by  constitutive 
expression of c-Myc, E2F-1 and other oncogenes, and appears to involve effects on
growth and survival pathways. The data indicate that TTP and the related TIS11
proteins  act  on  these  pathways  when  they  are  expressed  transiently. 

-- TTP in particular sensitizes cells to TNF-a-induced apoptosis, perhaps similarly to c-Myc
and E2F-1. This finding could be relevant to the phenotype of TTP knockout mice,
which  suffer  from  a  widespread  polyinflammatory  and  myeloproliferative  syndrome 
that is mediated by TNF-a.

CONCLUSIONS


A.  Determinants  of  bHLH  protein  DNA  binding  specificity: 

bHLH  protein  DNA  binding  specificity  deriving  from  effects  on  BR-DNA 
conformation. The myogenic MyoD BR residues A5 and T6 are essential for
myogenesis,  but  not  for  binding  of  MyoD  +  E2A  heterodimers  to  a  muscle-specific  site 
in  vitro  or  in  vivo  (Davis  and  Weintraub,  1992;  Weintraub  et  al.,  1991).  However,  we 
have  determined  that  these  residues  are  required  for  MyoD  to  bind  DNA  with  its 
characteristic  specificity  for  particular  CANNTG  sites.  Substitution  of  asparagine  for  T6, 
and especially for both A5 and T6, results in MyoD binding preferentially to a Twist site
(Figs.  8B  and  C,  lanes  10, 13,  and  18).  The  Twist-like  MD(E12B)  sequence  preference  is 
affected  partially  by  substitution  of  A5  for  the  corresponding  asparagine  (MD(E12B-A), 
Fig. 3C), but is reconfigured by introduction of both A5 and T6 so that it is
indistinguishable  from  that  of  wild-type  MyoD  (MD(E12B-AT),  Fig.  3C).  The  data 
indicate  that  MyoD  residues  A5  and  T6  are  each  critical  for  its  DNA  binding  sequence 
preferences,  and  that  the  N6  residue,  which  is  common  to  the  Twist  and  MD(E12B-A) 
BRs  (Fig.  2),  is  important  for  the  Twist-like  preference.  Mutations  of  these  individual  BR 
residues  alter  sequence  preferences  across  each  half-site  (Fig.  3C),  raising  the  question 
of  how  they  might  have  such  a  global  effect  on  how  the  BR  helices  and  the  DNA 
interact  preferentially  with  each  other. 

A  structure  of  MyoD  obtained  by  X-ray  crystallography  suggests  how  A5  and  T6 
might  influence  binding  sequence  specificity.  When  bound  to  its  preferred  recognition 
site,  MyoD  does  not  directly  contact  base  pairs  that  it  specifies  in  the  center  of  and 
flanking the CANNTG consensus (Ma et al., 1994). However, A5 and T6 allow the
MyoD  BR  helix  to  pack  more  tightly  into  the  major  groove  than  do  the  corresponding 
N5 and N6 residues of E2A proteins, in part because of their smaller sizes (Figs. 1 and
2) (Ma et al., 1994). As a result, the MyoD BR residues and R2 directly contact
CANNTG bases at ±2 and ±3 respectively, and R1 binds a backbone phosphate at ±6
(Fig. 1) (Ma et al., 1994). In contrast, in E47 R2 swings out of the major groove and
contacts  the  backbone,  and  the  residue  at  position  1  does  not  interact  directly  with  the 
DNA  (Brownlie  et  al.,  1997;  Ellenberger  et  al.,  1994).  Supporting  the  idea  that  A5  and  T6 
influence  the  conformation  of  the  DNA-bound  BR,  substitution  of  asparagine  for  A5  in 
MyoD  increases  its  sensitivity  to  protease  digestion  (Huang  et  al.,  1998).  Our  findings 
suggest that protein-DNA interactions that depend specifically upon the MyoD A5 and
T6 residues may directly influence how the BR helix interacts preferentially with the
DNA,  and  thereby  indirectly  specify  its  characteristic  sequence  preferences  at  positions 
within  and  flanking  the  CANNTG  consensus. 

Such  indirect  conformational  effects  also  appear  to  be  critical  for  the  E2A  and 
Twist  sequence  preferences.  When  E47  homodimers  bind  DNA,  a  single  subunit 
contacts  a  base  in  the  center  of  the  site  through  Rjg  (Fig.  2).  This  interaction  could  be 
important  for  the  asymmetric  E2A  homodimer  sequence  preference  (Ellenberger  et  al., 
1994).  However,  the  Twist-like  sequence  preference  that  is  characteristic  of  Twist  +  E2A 
heterodimers  and  MD(E12B)  homodimers  is  different  across  each  5  bp  half-site  and 
symmetric  (Figs.  3C  and  5B),  suggesting  that  it  is  likely  to  be  established  indirectly, 
through  an  intermolecular  effect  that  involves  a  distinct  positioning  of  the  BR  helix. 
Introduction  of  the  E12  BR-HLH  junction  region  into  MD(E12B)  "corrects"  its  binding 
preference  so  it  is  like  that  of  E2A  homodimers  (MD(E12BJ),  Fig.  5C,  lanes  7, 16,  and  25; 
Figs.  8B  and  C,  lane  20),  implicating  the  BR-HLH  junction  in  this  effect.  Presumably,  the 
E2A  junction  acts  in  concert  with  the  asparagines  at  BR  positions  5  and  6  (Fig.  2), 
although  the  Twist-like  preference  of  the  MD-AANNA(AD)  mutant  (Figs.  8B  and  C, 
lane  19;  not  shown)  suggests  that  the  E2A  junction  residue  D15  is  not  sufficient.  The 
finding  that  E2A  proteins  can  be  targeted  to  different  DNA  sequences  by  different 
dimer  partners  may  have  important  implications  for  their  in  vivo  functions. 

In  contrast,  the  BR-HLH  junction  region  does  not  have  a  strong  influence  on  the 
MyoD  DNA  binding  preference.  Various  MyoD  junction  mutations  do  not  substantially 
diminish  its  preference  for  a  MyoD  site  (Figs.  8B  and  C,  lanes  5-9).  In  addition,  the 
similar  sequence  preferences  of  E12(MDB)  and  E12(MDBJ)  homodimers  (Fig.  9,  lanes  3, 
4, 10, 11, 17,  and  18)  contrast  sharply  with  the  different  specificities  of  MD(E12B)  and 
MD(E12BJ) (Figs. 3D, lanes 2 and 6, and 8B and C, lane 20). This apparent difference
between  MyoD  and  E2A  proteins  might  derive  from  the  distinct  arrangement  of  the  BR 
helix on the DNA that results from presence of MyoD residues A5 and T6.

It  is  striking  that,  as  a  group,  these  various  bHLH  mutants  and  dimer 
combinations  bind  DNA  with  a  limited  number  of  discrete  sequence  preferences  (Figs. 
3C  and  5B).  Presumably,  each  of  these  preferences  reflects  a  preferred  conformational 
"state"  that  is  dictated  by  how  each  BR  helix  and  the  corresponding  DNA  sequence 
conform  to  each  other  in  an  induced  fit  (Spolar  and  Record  Jr.,  1994).  This  mechanism 
for  recognizing  particular  CANNTG  sites  appears  to  be  different  from  the  direct 
recognition  of  central  bases  that  is  characteristic  of  bHLH  proteins  that  contain  R13,  and 
bind to CACGTG or CATGTG sites (Ferre-D'Amare et al., 1993; Ferre-D'Amare et al.,
1994;  Shimizu  et  al.,  1997).  Consistent  with  this  idea,  BR  residues  5  and  6  do  not  appear 
to be important for the function of the R13-containing bHLH protein c-Myc (Bodis et al.,
1997).  In  E2A  and  its  tissue-specific  dimerization  partners,  a  more  flexible 
conformation-based  mechanism  might  have  evolved  to  increase  adaptability  in  both 
sequence  recognition  and  function,  so  that  different  combinations  of  these  proteins  can 
result  in  distinct  protein-DNA  conformations  that  correspond  to  particular  DNA 
sequence  preferences.  Such  a  model  may  be  particularly  plausible  for  bHLH  proteins, 
because folding of the BR into an α-helix is driven by its interaction with the DNA
(Anthony-Cahill  et  al.,  1992). 
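
As a rough illustration of the consensus notation used in this section, the following Python sketch scans a DNA string for CANNTG E boxes and reports the central dinucleotide of each match. The function names `find_e_boxes` and `classify`, and the class labels, are assumptions introduced here for illustration. The sketch only pattern-matches the consensus; it does not model the conformation-dependent binding preferences discussed above.

```python
import re

# CANNTG consensus (the E box). The lookahead lets overlapping sites be reported.
E_BOX = re.compile(r"(?=(CA([ACGT]{2})TG))")


def find_e_boxes(seq):
    """Return (start, site, central_dinucleotide) for each CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in E_BOX.finditer(seq)]


def classify(site):
    """Separate the CACGTG/CATGTG sites named in the text from other E boxes."""
    if site in ("CACGTG", "CATGTG"):
        return "CACGTG/CATGTG class"
    return "other CANNTG"


if __name__ == "__main__":
    # Sequence bound by the MyoD homodimer in the crystal structure (Figure 1).
    for start, site, center in find_e_boxes("AACAGCTGTT"):
        print(start, site, center, classify(site))
```

Run on the Figure 1 sequence, the scan finds a single CAGCTG site, whose GC center places it outside the CACGTG/CATGTG class.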

BR-DNA  conformation,  DNA  binding  specificity,  and  myogenesis.  The  observation 
that the MyoD junction and K15 are not required for an appropriate DNA binding
specificity (Fig. 8B and C, lanes 6-9; Fig. 9) supports the model that K15 is involved in
other essential interactions (Davis and Weintraub, 1992). However, our experiments
also pose the question of how the functional importance of A5 and T6 might be related
to  their  effects  on  DNA  recognition.  Of  the  MyoD  BR  mutants  we  have  analyzed,  those 
that  do  not  induce  myogenesis  bind  to  DNA  as  homodimers  with  a  Twist-like 
preference  (MD(E12B)  and  MD(E12B-A),  Figs.  2  and  3C).  Heterodimers  of  MD(E12B) 
with  E12  prefer  a  heterodimer  site  (Fig.  5B),  but  with  markedly  diminished  specificity 
compared  to  MyoD  +  E12  dimers  (Fig.  5C,  lanes  3, 5, 12, 14, 21,  and  23;  Fig.  6;  Fig.  7A 
and B, lanes 1-12). This suggests that, at least in part, A5 and T6 might be significant for
myogenesis  because  they  restrict  the  DNA  binding  specificity  of  MyoD  and  other 
myogenic  bHLH  proteins,  so  that  they  are  less  likely  to  bind  inappropriate  sites. 
However,  other  observations  support  a  role  for  the  A5  and  T6  residues  in  protein- 
protein  interactions.  They  have  been  implicated  in  binding  to  other  proteins  off  the 
DNA (Hamamori et al., 1997; Molkentin et al., 1995), and evidence indicates that they
are  required  for  activation  domain  exposure  (Black  et  al.,  1998;  Huang  et  al.,  1998; 
Weintraub  et  al.,  1991),  and  cooperative  DNA  binding  (Bengal  et  al.,  1994).  Finally, 
unlike MyoD, MD(E12B) can activate transcription of a reporter only in particular cell
lines,  implicating  the  BR  in  protein-protein  interactions  (Weintraub  et  al.,  1991). 

In light of evidence that A5 and T6 establish the conformation of the DNA-bound
BR,  it  is  an  attractive  model  that  this  effect  might  influence  the  function  of  myogenic 
bHLH  proteins  directly,  by  affecting  their  interactions  with  other  proteins.  Given  that 
relatively  subtle  alterations  of  the  MyoD  BR  and  junction  region  can  enhance  MyoD 
DNA  binding  significantly  (MD(AK)  and  MD(AAATA),  Fig.  8B  and  C,  lanes  4, 5,  and  10), 
it  appears  likely  that  cooperative  protein-protein  interactions  with  the  BR  and  junction 
could  influence  binding  affinity.  It  has  been  demonstrated  recently  that  MyoD  binds 
cooperatively  with  other  DNA  binding  proteins  to  a  particular  muscle-specific  promoter 
(Biesiada  et  al.,  1999).  The  E  box  sequences  through  which  MyoD  activates  transcription 
in  the  context  of  this  promoter  can  differ  from  those  it  binds  preferentially  in  vitro 
(Huang  et  al.,  1996),  suggesting  that  DNA  sequence  recognition  may  be  influenced  by 
interactions  with  cooperating  proteins  in  vivo.  In  addition,  interactions  with  cooperating 
proteins  might  be  influenced  in  turn  by  the  specificity  of  DNA  sequence  recognition,  as 
suggested  by  evidence  that  for  MyoD  and  E  proteins  the  choice  between  homo-  or 
heterodimer  formation  may  be  dictated  by  the  DNA  binding  affinities  of  the  individual 
BRs  (Maleki  et  al.,  1997;  Wendt  et  al.,  1998).  Our  findings  are  consistent  with  the  idea 
that  deceptively  subtle  aspects  of  sequence  recognition  could  be  important  for  the 
biological  activity  of  MyoD,  if  they  influence  functionally  critical  interactions  that  might 
also involve K15, or other MyoD regions. They also suggest that manipulation of these
interactions could have important consequences for the functions of these proteins,
and  that  these  concepts  may  be  applicable  to  DNA  recognition  by  other  bHLH  proteins 
such  as  c-Myc. 

B. TTP/TIS11 proteins and cell survival:

When  expressed  constitutively  at  approximately  physiological  levels,  the 
TTP/TIS11 immediate early proteins each trigger apoptosis with comparable frequency
and  timing  (Figs.  13A  and  B),  suggesting  that  they  act  similarly  to  each  other  on  related 
or  overlapping  regulatory  pathways.  Various  findings  suggest  that  they  induce  cell 
death analogously to oncoproteins such as c-Myc, E1A, and E2F-1 (Han et al., 1998;
Hsieh  et  al.,  1997;  Juin  et  al.,  1999;  Kowalik  et  al.,  1995;  Phillips  et  al.,  1997;  Phillips  et  al., 
1999).  For  example,  in  contrast  to  the  rapid  stimulus  associated  with  death  receptor 
triggering (Fig. 14C), TTP/TIS11 proteins cause apoptosis over 24 to 48 hours (Figs.
13A and B). This apoptosis is accelerated when survival signals are absent (Fig. 13D)
and  is  dependent  upon  the  mitochondrial  machinery  (Fig.  13C),  which  is  responsive  to 
abnormal  growth  regulation,  survival  signals,  and  stress  (Gross  et  al.,  1999).  TTP  in 
particular  also  appears  to  be  analogous  to  these  oncoproteins  in  that  it  can  sensitize  cells 
to induction of apoptosis by TNF-a (Fig. 14). Unlike c-Myc, however, TTP/TIS11
proteins  can  trigger  apoptosis  in  the  presence  of  serum,  and  in  this  respect  may  be 
comparable  to  E2F-1,  which  stimulates  DNA  replication  and  cell  cycle  progression 
(Nevins,  1998;  Phillips  et  al.,  1999).  In  some  experiments,  TTP  expression  increased  the 
number  of  cells  present  at  24  hours  after  transfection  (not  shown),  suggesting  that  it 
might  also  promote  proliferation. 

Precedents set by various other Cys3His zinc finger proteins predict that
TTP/TIS11 proteins are likely to have RNA-associated functions (Barabino et al., 1997;
Guedes  and  Priess,  1997;  Murray  et  al.,  1997;  Rudner  et  al.,  1998;  Tabara  et  al.,  1999; 
Tronchere  et  al.,  1997),  and  evidence  indicates  that  TTP  can  bind  and  influence  the 
stability  of  TNF-a  and  other  ARE-containing  cytokine  mRNAs  (Carballo  et  al.,  1998;  Lai 
et al., 1999). We have determined that TTP/TIS11 proteins influence cell growth and
survival  mechanisms,  suggesting  that  they  may  also  act  on  mRNAs  that  are  involved  in 
those  pathways.  During  growth  factor  responses,  the  localization,  stability,  and 
translation  of  various  mRNAs  are  regulated  through  AREs  that  are  distinct  from  but 
related  to  those  of  cytokine  mRNAs  (Chen  and  Shyu,  1995;  Ross,  1995).  Such  mRNAs 
might be candidate TTP/TIS11 protein targets, but it will be important to discriminate
among  the  direct  and  indirect  effects  of  these  proteins  because  an  extremely  large 
number  of  genes  are  regulated  during  these  responses  (Iyer  et  al.,  1999). 

The apoptotic effect of expressing TTP/TIS11 proteins constitutively might
derive  simply  from  their  being  present  at  inappropriate  phases  of  the  cell  cycle. 
However,  other  immediate-early  proteins  that  are  associated  with  growth  or 
proliferation  are  involved  in  apoptosis  (Hafezi  et  al.,  1997;  Shi  et  al.,  1992;  Zhan  et  al., 
1997),  and  TTP  is  expressed  during  apoptotic  events  (Harkin  et  al.,  1999;  Mesner  et  al., 
1995), suggesting that TTP/TIS11 proteins might normally promote apoptosis in some
contexts.  The  observation  that  TTP  can  sensitize  cells  to  the  apoptotic  effects  of  TNF-a 
is  consistent  with  this  idea.  The  apparently  similar  sensitization  by  c-Myc  has  been 
proposed  to  involve  the  mitochondrial  death  machinery  (Juin  et  al.,  1999),  and  the 
analogous  effects  of  E2F-1  have  been  linked  to  down-regulation  of  anti-apoptotic 
mechanisms (Phillips et al., 1999). TIS11b and TIS11d can stimulate the mitochondrial
death machinery (Fig. 13C) but do not sensitize cells to TNF-a-induced apoptosis (Fig.
14), suggesting that TTP acts on additional pathways. TTP might influence TNF-a-induced
apoptosis by acting on growth-related pathways distinct from those affected by
TIS11b and TIS11d, or by interfering with anti-apoptotic gene expression.

Our  findings  raise  the  question  of  why,  in  mice,  lack  of  TTP  causes  a  specific 
defect in TNF-a regulation. Although changes in overall TIS11b and TIS11d mRNA
levels  have  not  been  detected  in  various  TTP  -/-  mouse  tissues  (Taylor  et  al.,  1996),  it 
remains possible that TTP/TIS11 proteins could have partially redundant functions.
Alternatively,  TTP  might  function  in  growth  regulatory  pathways  that  are  largely 
redundant  with  other  mechanisms.  TNF-a  expression  is  regulated  by  a  complex  array 
of  transcriptional  and  post-transcriptional  mechanisms  that  respond  to  numerous 
inputs  (Beutler  et  al.,  1992).  By  suggesting  that  TTP  expression  influences  cell  growth  or 
survival pathways, and sensitizes cells to TNF-a-induced apoptosis, our experiments
have  identified  avenues  through  which  it  might  influence  TNF-a  expression  or 
responses indirectly. They indicate that elucidation of how TTP/TIS11 proteins act on
their targets not only will reveal critical aspects of TNF-a regulation, but also may
uncover post-transcriptional gene regulation mechanisms that are involved in cellular
responses  to  multiple  stimuli,  including  EGF  and  various  mitogens. 

C.  Figure  legends: 

Figure  1. 

A MyoD-DNA complex. In this X-ray crystallographic structure (Ma et al., 1994),
a MyoD homodimer is bound to the sequence AACAGCTGTT, which corresponds to its
preferred recognition consensus (Blackwell and Weintraub, 1990). Residues are
numbered as in full-length MyoD, and their positions as specified in Fig. 2 and the
text are indicated in parentheses. Binding site positions ±5 (numbered as in Fig. 3A) are
indicated by grey numerals. Side chains are shown only for the myogenic residues
(green) (Davis and Weintraub, 1992) and Arg 111 (R2) (gold).

Figure  2. 

Myogenic  activity  of  MyoD  and  E12  BR  and  junction  mutants.  Each  of  these 
mutants  has  been  described  previously  (Davis  and  Weintraub,  1992;  Weintraub  et  al., 
1991),  and  is  compared  with  sequences  from  mouse  MyoD,  E12,  and  Twist.  Amino 
acids  that  are  identical  to  those  of  MyoD  are  underlined,  positions  that  are  conserved  in 
most  bHLH  proteins  are  shaded,  and  entire  BR  and  junction  regions  that  have  been 
swapped  are  bracketed.  The  column  "muscle"  indicates  the  relative  activity  of  these 
proteins  when  assayed  previously  by  transfection  for  conversion  of  cultured  cells  into 
muscle (Davis and Weintraub, 1992; Weintraub et al., 1991). The "++++" indicates the
frequency of myogenic conversion obtained with wild-type MyoD, the "++" 30-50% of
that obtained with wild-type MyoD, and the "+" 5-30% of that obtained with wild-type
MyoD. "No" indicates that myogenic conversion was not detected, and "ND" indicates
not done.

Figure  3. 

In vitro selection assay of binding site preferences. (A) Core sequences of the
random sequence oligonucleotide libraries D3 and D6 (Blackwell et al., 1990; Blackwell
and Weintraub, 1990). In each library, the bases shown are flanked by sequences that
correspond to primers (A and B) that allow selected sequences to be recovered by PCR.
A' indicates that primer A corresponds to the opposite strand. (B) Sequences of
preferred binding sites. Starting with the D6 oligonucleotide random sequence library
(A), three rounds of sequential selection and PCR amplification were performed for
binding to the proteins indicated. A sample of the final selected population of binding
sites was then sequenced directly as a pool and analyzed by autoradiography. The
MyoD preferences at positions ±1 described previously (Blackwell and Weintraub, 1990)
are more prominent after additional selection rounds (not shown). (C) Summary of
sequence preferences identified by in vitro selection in (B). MyoD and E2A homodimer
preferences were described previously (Blackwell and Weintraub, 1990). Binding site
positions are numbered as in (B), and grey letters indicate bases that were selected
against. The CANNTG consensus that was fixed in these experiments is underlined. (D)
Binding of MyoD BR mutants to individual oligonucleotide sites, which differed only at
the sequences shown. In this EMSA, which was analyzed by phosphorimaging, each
sample contained the indicated in vitro translated protein at a concentration of 40 pM,
and DNA that was labeled to the same specific activity at 550 pM. Specific and
background species are indicated by open and closed triangles, respectively.

Figure  4. 

Specificity  of  MyoD  BR  mutant  DNA  binding.  (A)  Competition  analysis  of 
binding  to  the  labeled  MyoD  preferred  site,  analyzed  by  EMSA  and  autoradiography. 
The  indicated  in  vitro  translated  proteins  and  DNA  labeled  to  the  same  specific  activity 
were  present  at  concentrations  of  50  pM  and  900  pM,  respectively.  When  the  samples 
were  mixed,  unlabeled  competitor  DNA  sites  were  added  at  the  indicated  ratios  relative 
to  the  labeled  probe.  (B)  Competition  analysis  of  binding  to  the  Twist  preferred  site, 
performed  as  in  (A). 

Figure  5. 

Binding site preferences of MyoD, E2A, and Twist heterodimer complexes. (A)
In vitro selection analysis of binding site preferences. Four rounds of selection from the
D3 library (Fig. 3A) were performed for each in vitro translated protein complex. In
each case, the heterodimer complex could be easily identified in the EMSA on the basis
of mobility (Blackwell and Weintraub, 1990), particularly because E12 homodimers bind
DNA poorly (Fig. 9). In the Twist homodimer selection, binding to Twist/E12
heterodimers was selected for in the first round, because of the relatively low level of
Twist homodimer binding. Subsequent rounds were performed using Twist
homodimers. Each sample was analyzed by sequencing and autoradiography as in Fig.
3B. (B) Summary of sequence preferences identified in (A), depicted as in Fig. 3C.
MyoD + E2A heterodimer preferences were also described previously (Blackwell and
Weintraub, 1990). (C) Binding of bHLH heterodimers to individual preferred sites,
analyzed by EMSA and phosphorimaging. E2A-derived proteins were present at a
concentration of 8 pM, and Twist and MyoD-derived proteins at 19 pM. The indicated
DNA sites that had been labeled to the same specific activity were present at 550 pM.
The MCK-R site differs from the others only at the positions shown. A background
species is indicated by a closed triangle.

Figure  6. 

Binding competition analysis of DNA binding by bHLH heterodimers. In (A)
and (B), binding of the indicated protein complexes to the labeled MyoD site (Fig. 3D)
was competed by addition of an unlabeled binding site at ratios indicated above the gel.
These experiments were performed and analyzed as in Fig. 4, except that labeled DNA
was present at 600 pM, E12 protein at 8 pM, and all other proteins at 19 pM. In
(C) and (D), binding of the indicated protein complexes to the labeled Twist site (Fig.
3D) was competed by addition of the indicated unlabeled sites. These experiments were
performed as in (A) and (B), except that labeled DNA was present at 1.1 nM, and they
were analyzed by autoradiography. Note that the gel shown in (C) was exposed longer
than that shown in (D), as indicated by comparison of lanes 1-6. A background species is
indicated by a closed triangle.

Figure  7. 

Protein titration of DNA binding by bHLH heterodimers. (A) Binding to the
Twist site, analyzed by EMSA and phosphorimaging. In each experiment, E12 was
present at 8 pM and DNA that had been labeled to the same specific activity at 5 pM.
The indicated partner proteins were present at the concentrations shown above the gel
(in pM). (B) Binding to the MCK-R site, analyzed as in (A).

Figure  8. 

Effects  of  bHLH  BR  and  BR-HLH  junction  residues  on  MyoD  binding 
preferences.  (A)  Mutagenesis  analysis  of  the  MyoD  BR  and  junction.  MyoD  BR  mutant 
sequences  are  compared  with  the  MyoD,  E12,  and  Twist  BRs  (Fig.  2).  Conserved  bHLH 
residues  are  shaded,  and  residues  that  are  altered  within  full-length  MyoD  are 
underlined.  (B)  Binding  of  MyoD  mutants  described  in  (A)  to  the  MyoD  preferred  site. 
These  mutants  are  compared  with  the  indicated  wild-type  proteins,  and  binding  is 
assayed  as  in  Fig.  3D,  except  that  each  protein  is  present  at  40  pM,  and  DNA  labeled  to 
the  same  specific  activity  present  at  400  pM.  E47  is  an  alternatively-spliced  E2A  protein 
that  binds  DNA  well  as  a  homodimer  (Murre  et  al.,  1989).  (C)  Binding  of  MyoD 
mutants  to  the  Twist  preferred  site,  assayed  as  in  (B). 

Figure  9. 

DNA  binding  by  E12  mutants.  DNA  binding  by  the  indicated  protein  complexes 
is  assayed  as  in  Fig.  5C,  except  that  all  E12  derivatives  are  present  at  8  pM,  and  E47  at  19 
pM.  A  protein-DNA  complex  of  intermediate  mobility  that  corresponds  to  E47-E12 
heterodimers  is  indicated  by  an  asterisk,  and  a  background  species  by  a  closed  triangle. 

Figure  10 

Cell death in response to TTP. (A) Blue cell assay of TTP-induced death. Cells
were transfected with 1.5 μg expression plasmid (either empty vector CS2 or CS2TTP),
and 0.5 μg β-gal reporter plasmid, then X-gal stained after 48 hours and shown at 40×
magnification. (B) TUNEL assay. Cells were stained for TUNEL activity at 24 hours
after Lipofectamine Plus transfection (with either 1 μg CS2TTP or CS2 vector control
expression plasmid), or after treatment with TNF-α (100 ng/ml) and cycloheximide
(30 μg/ml) for 3 hours as a positive control. Typical fields are shown. No staining was
detected in a parallel experiment that lacked the TUNEL reagent (not shown). (C) TTP-
induced cell death in cell lines and primary cells (designated by an asterisk). In a similar
experiment to (A), cells were transfected with 200 ng of CS2 vector or CS2TTP, and 100
ng of β-gal reporter plasmid. After 24 or 48 hours they were X-gal stained and the
percentage of dead blue cells was determined. Numbers indicate the mean of four
transfections ± the standard deviation. Human foreskin keratinocytes are indicated by
HFK, and mouse embryo fibroblasts by MEF.


Figure  11 

Cell death caused by forced expression of TTP at modest levels. (A) Expression
of TTP after serum induction or transfection of TTP expression plasmid, assayed by
western blotting with affinity-purified TTP antibody. 3T3 cells were serum stimulated
(Taylor et al., 1996), or transfected with the indicated amount of TTP expression vector.
Each lane contained 200 μg total protein. The 43 kD species present in lanes 2, 3, and 9
corresponds to TTP. A background band similar to that found in all lanes in this gel
(indicated by an asterisk) has been detected by a different antiserum against the same
peptide (Carballo et al., 1998). The more slowly-migrating TTP-specific bands (lanes 9
and 10) can be converted to faster-migrating species by phosphatase treatment (lane
11), suggesting that they represent phosphorylated TTP forms described previously
(Taylor et al., 1995). The western blot shown in lanes 10 and 11 was performed using
TTP antiserum that had not been affinity purified, and does not detect the background
species apparent in lanes 1-9. (B) TTP expression in transfected cells. Typical fields are
shown. 24 hours after transfection with the indicated amounts of TTP vector and either
100 or 200 ng pNSeGFP, cells were fixed and stained with affinity-purified TTP antibody
(1 μg/ml), which was detected with a Cy3-conjugated secondary antibody. Nuclei are
revealed by Hoechst staining, and transfected cells by GFP autofluorescence.
Additional GFP-positive cells with low fluorescence levels were visible under the
microscope. A dual image shows the overlap between GFP fluorescence and TTP
staining. Arrows on Hoechst-stained fields indicate apoptotic cells in which TTP staining
is indistinguishable from background, and which are labeled in the TTP field with u
(undetectable). Cells that have high TTP staining levels are labeled with h, and those
that have just-visible TTP with v. (C) TTP staining levels compared with apoptosis. Cells
were transfected with the indicated amounts of TTP vector, then 24 hours later the
percentage of cell death was determined as in Fig. 10C. In a control transfection lacking
TTP expression vector, 5% of the cells were apoptotic, all of which were GFP-negative
(not shown). In a parallel experiment, cells were transfected on coverslips with the
same amounts of TTP expression vector and the GFP reporter plasmid as in (B). After
staining, TTP expression was scored as in (B) (u, v, or h) in approximately 50 apoptotic
and 300 non-apoptotic GFP-positive cells. The percentages indicated in the table refer to
the proportion of each group (apoptotic or non-apoptotic) within each staining
category.


Figure  12 

The TTP zinc fingers are required for apoptosis. (A) The M1 and M1,2 mutants,
in which zinc finger residues that were substituted within full-length TTP are
highlighted. (B) TTP mutant expression, assayed by western blotting following
transfection of 2 μg of the indicated plasmids. Each lane of the gel, which was not run as
far as that shown in Fig. 11A, contained 100 μg total protein. TTP was detected using
serum that had not been affinity purified (Fig. 11A). (C) Cell death caused by TTP
mutants. Cells transfected with 200 ng of the indicated constructs were assayed for cell
death 48 hours after transfection as in Fig. 10C.

Figure  13 

Cell death in response to each TTP/TIS11 protein. In (A) and (B), 3T3 cells were
transfected with the indicated amounts of TTP, TIS11b, TIS11d, or CS2 control vectors
along with 100 ng of β-gal reporter, then cell death was assayed as in Fig. 10C after
either 24 (A), or 48 hours (B). (C) Suppression of TTP/TIS11-induced apoptosis by Bcl-2
and CrmA. 200 ng of the indicated expression vector and 100 ng of β-gal reporter were
co-transfected into HeLa cells. For Bcl-2 inhibition (white bars), 200 ng of CMV-Bcl-2
were added, and for CrmA inhibition (shaded bars), 1.7 μg of CMV-CrmA were added.
After 24 hours, death was assayed as in Fig. 10C. (D) Enhancement of TTP/TIS11-
induced apoptosis by serum deprivation. 3T3 cells were transfected using Lipofectamine
Plus with 100 ng of effector plasmid as in (C). After 3 hours, they were incubated in
medium containing either 0.1% serum (dark gray bars), 0.1% serum plus 100 ng/ml IGF-1
(Sigma; hatched bars), 0.1% serum plus serum replacement (SR) medium 1 (Sigma;
mid-gray bars), 0.1% serum plus IGF-1 and SR1 (white bars) or 10% serum (black bars).
SR1 medium provides nutrient replacement but not growth factors. Cell death was
assayed 21 hours later.

Figure  14 

Synergistic induction of cell death by TTP and TNF-α. (A) Effects of TNF-α
treatment shortly after transfection. 3T3 cells were transfected with 25 ng of the
indicated expression vector and 100 ng of β-gal reporter, using Lipofectamine Plus
(Gibco BRL). Recombinant mouse TNF-α (R & D Systems) was added to the indicated
concentration after 3 hours, and 19 hours later cell death was assayed as in Fig. 10C. (B)
Addition of TNF-α after 24 hours of TTP expression. This experiment was conducted as
in (A), except that TNF-α was added 24 hours after transfection, and cell death assayed
24 hours later. (C) Addition of TNF-α and cycloheximide after 24 hours of TTP
expression. This experiment was conducted as in (B), except that cycloheximide was
added to 10 μg/ml along with TNF-α, and cell death was assayed 4 hours later.

A method has been developed for selecting functional enhancer/promoter sites from random DNA sequences
in higher eukaryotic cells. Of sequences that were thus selected for transcriptional activation by the muscle-
specific basic helix-loop-helix protein MyoD, only a subset are similar to the preferred in vitro binding
consensus, and in the same promoter context an optimal in vitro binding site was inactive. Other sequences
with full transcriptional activity instead exhibit sequence preferences that, remarkably, are generally either
identical or very similar to those found in naturally occurring muscle-specific promoters. This first systematic
examination of the relation between DNA binding and transcriptional activation by basic helix-loop-helix
proteins indicates that binding per se is necessary but not sufficient for transcriptional activation by MyoD and
implies a requirement for other DNA sequence-dependent interactions or conformations at its binding site.

The basic helix-loop-helix (bHLH) family of proteins (48)
includes many that are pivotal in cell type determination, such
as the MyoD group (including MyoD, myogenin, myf-5, and
MRF4) for skeletal myogenesis (10, 20, 55, 73, 75). The bHLH
proteins form homo- or heterodimers through the HLH
domain and bind to DNA through the adjacent basic region (BR)
(16, 72). They bind as dimers to DNA sites that generally share
the consensus CANNTG (the E box) (8, 49), with each BR
occupying half of the site (8, 19, 21, 44).
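
As an illustration of the E-box consensus just described, the sketch below scans a DNA string for CANNTG. It is only a minimal example, not part of the reported methods: the function name `find_e_boxes` is hypothetical, and the two test sequences are the MyoD site from the crystal structure legend (AACAGCTGTT) and the HCA core sequence shown in Fig. 1B.

```python
import re

# CANNTG with N = any of A, C, G, T. The lookahead lets overlapping
# matches be reported, should a sequence contain them.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return a list of (0-based start, 6-mer) for each CANNTG match in seq."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq.upper())]


# Sequences quoted in this document (assumed to be written 5' to 3').
myod_site = "AACAGCTGTT"
hca_core = "CTGCTCCAACTGACCCTG"

print(find_e_boxes(myod_site))
print(find_e_boxes(hca_core))
```

Because the consensus fixes only four of six positions, many different hexamers qualify as E boxes, which is why flanking and internal bases have to be examined separately.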

MyoD family members form homodimers relatively
inefficiently (67) and instead appear to activate transcription of
muscle-specific genes as heterodimers with the widely
expressed E proteins (including E2A [E12 and E47], E2-2, and
HEB [3]) (41). E proteins also form heterodimers with other
tissue-specific bHLH proteins (49), to regulate still different
sets of genes in various tissues such as erythrocytes (1),
pancreas (51), and neurons (36), and they form homodimers which
are essential for B-cell differentiation (4, 18, 39). The ability of
these different protein complexes to regulate different sets of
genes appears paradoxical, because their DNA binding
specificities are often surprisingly overlapping. For example, certain
sites that can be bound well by either MyoD-E or E-E
complexes in vitro (8, 49) can be activated by only one or the other
of these complexes in the cell (25, 77). It therefore appears
either that other protein factors are involved in determining
transcriptional specificity or that subtle differences in DNA
binding specificity of the bHLH proteins must be biologically
significant. These two mechanisms may not be mutually
exclusive. Evidence suggests that MyoD may require a positive
cofactor to activate transcription of appropriate genes (9, 16, 17,
47, 76) and that it can be prevented from acting at
inappropriate sites by repressors that recognize bases overlapping the
E box (25, 77). Remarkably, these proposed cofactor and
repressor functions both appear to depend on the presence of
particular MyoD BR residues, mutations of which may allow
DNA binding but not transcriptional activity. X-ray
crystallographic studies indicate that in the bound complex these MyoD
residues are in intimate proximity to the DNA and that the
MyoD BR assumes a binding conformation different from that
of the E47 protein (19, 44), suggesting that this conformation
may in turn constitute an important aspect of interactions
between MyoD (or other bHLH proteins) and coactivator or
repressor functions at the promoter.

In light of these observations, it is critical to study
systematically the DNA sequence requirements for transcriptional
activation by bHLH proteins. Preferred binding sites for MyoD
and E proteins have been identified by random-sequence
selection in vitro (8, 67, 79), but such assays cannot predict the
ability of these proteins to support transcriptional activation.
To this end, we have developed the first system that allows
proteins to select functional DNA targets from random
sequences in mammalian cells. In this new method, dubbed the
selection of in vivo target elements (SITE) technique (Fig.
1A), oligonucleotides with random sequences at positions of
interest were embedded in a muscle-specific promoter cassette,
in this case the human cardiac α-actin (HCA)
enhancer/promoter cassette, located upstream of a β-galactosidase (β-Gal)
reporter gene. The resulting plasmid library was cotransfected
with a MyoD expression vector into murine NIH 3T3
fibroblasts, cells that expressed β-Gal were then selected by
fluorescence-activated cell sorting (FACS), and finally plasmid
DNA was extracted from selected cells and amplified in
Escherichia coli to allow this selection procedure to be repeated.
Three such selection rounds yielded sequences through which
MyoD can activate transcription in vivo. These sequences
overlap with but are generally distinct from optimal MyoD or
MyoD-E binding sites identified in vitro (8, 79). Instead, the
selected sites are similar to various functional E boxes present
in native muscle-specific promoters. Remarkably, the active
sequences selected from the HCA promoter context do not
necessarily activate transcription in the context of another
muscle-specific promoter, such as the muscle creatine kinase
(MCK) promoter. These findings indicate, surprisingly, that
the optimal binding sites are not always capable of activating
transcription and suggest that binding per se is not sufficient
for activation, which may thus be very sensitive to promoter
context and/or binding conformation.

2. Insert the cassette upstream
of β-gal reporter gene; amplify the
library in E. coli.

3. Cotransfect with MyoD
into 3T3 cells.

4. FACS selection of β-gal+ cells.

-5 -4 -3 -2 -1 1 2 3 4 5

HCA   5'-CTGCTCCAACTGACCCTG-3'

4N      -CTGCTCNNACNNACCCTG-

222N    -CTGCNNCANNTGNNCCTG-

FIG. 1. Selection scheme. (A) An oligonucleotide was synthesized with random sequences (designated N) at positions of interest in the protein-binding site. A
double-stranded cassette was generated by extension over a complementary segment and cut with appropriate restriction enzymes at both ends. The cassette was then
ligated unidirectionally into a β-Gal reporter plasmid such that transcription of the β-Gal gene is under the control of the randomized enhancer/promoter cassette. The
ligation mixture was electroporated into E. coli cells, the transformants were plated on LB-agarose-ampicillin plates and allowed to form colonies, which were then
pooled, and the plasmids were extracted. (The size of the plasmid library was made larger than 4.6 × 4^n, n being the number of nucleotides made random, to ensure
a probability of no less than 99% to encompass the complexity of the library 4^n [59].) The plasmid pool was cotransfected with a MyoD expression plasmid into 3T3
cells, and β-Gal+ cells were collected by using a fluorescence-activated cell sorter, the stringency of selection being controlled by fluorescence gating. The plasmids in
the collected cells were retrieved by the Hirt protocol, amplified by transformation back into E. coli cells, and used for the next round of selection. After the desired
number of rounds of selection, the active enhancer sequences were determined by DNA sequencing and confirmed by a transfection test. (B) Core sequences of the
random cassettes 4N and 222N. The 18-bp core sequences of 4N and 222N are based on the MyoD binding site of the HCA promoter (46). The consensus CA-TG
motif is indicated by dots.
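
The library-size rule quoted in the Fig. 1 legend (more than 4.6 × 4^n clones for n randomized positions) can be checked with a short calculation. The sketch below uses one common reading of that rule, which is an assumption rather than something the legend spells out: each clone is treated as an independent, uniform draw from the 4^n possible sequences, and the target is a 99% chance that any given sequence is present. The function name `min_library_size` is hypothetical.

```python
import math


def min_library_size(n_random, p_include=0.99):
    """Return (complexity, clones needed) for n_random randomized positions.

    Assumes independent, uniform sampling of the 4**n_random sequences.
    P(a given sequence is present among N clones) = 1 - (1 - 1/C)**N,
    so N = ln(1 - P) / ln(1 - 1/C), where C is the library complexity.
    """
    complexity = 4 ** n_random
    n_clones = math.log(1.0 - p_include) / math.log(1.0 - 1.0 / complexity)
    return complexity, math.ceil(n_clones)


# 4N has four randomized positions and 222N has six (Fig. 1B).
for name, n in (("4N", 4), ("222N", 6)):
    complexity, clones = min_library_size(n)
    print(name, complexity, clones, round(clones / complexity, 2))
```

Under these assumptions the ratio of clones to complexity approaches -ln(0.01), about 4.6, which matches the factor given in the legend.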

MATERIALS  AND  METHODS 

Plasmid constructions. The reporter plasmid pool pNNN-β-gal (Fig. 1A) was
constructed by ligating HCA promoter region -89 to +68 (46) containing each
random nucleotide cassette (Fig. 1B), the lacZ reporter gene from the pNASSβ
vector (Clontech), and a fragment from the pBluescript II KS (+) vector
(Stratagene) that contained the ampicillin resistance gene and the ColE1 bacterial
replication origin. Two E boxes (PvuII sites) in the plasmid backbone were
deleted to allow unambiguous analysis. The MyoD expression vector used for
SITE experiments, pMyoD-Kan (Fig. 1A), was derived from the pEMSV-MyoD
expression vector (16) by inserting the kanamycin resistance gene at the ScaI site
of the ampicillin resistance gene, which inactivates the latter selection marker.

In plasmid 115MCK-β-gal, the MCK enhancer region (-1207 to -1093) and
promoter region (-82 to +7) were used to drive the lacZ reporter gene in the
same plasmid backbone as in pNNN-β-gal. The various E-box substitutions into
the MCK context (Fig. 4) were created by mutagenesis using PCR (31) and
confirmed by DNA sequence analysis.

Cell culture and transfection. NIH 3T3 cells were maintained in Dulbecco’s
modified Eagle’s medium supplemented with 10% bovine calf serum (Gibco).
Cells were cotransfected with 7.5 μg of β-Gal reporter plasmid and 7.5 μg of
MyoD expression vector per 10-cm-diameter plate by calcium phosphate
coprecipitation (12). Seventeen to 21 h posttransfection, the cells were switched to
differentiation medium (Dulbecco’s modified Eagle’s medium containing 2%
horse serum [Gibco] plus transferrin [Sigma; 10 μg/ml] and insulin [Sigma; 10
μg/ml]) for 2 days before they were harvested for FACS.

FACS selection of β-Gal+ cells. Cells were washed three times with 1×
phosphate-buffered saline (PBS), detached and dissociated in 5 mM EDTA in
PBS, and passed through a 23-gauge needle to further dissociate doublets or
bigger clumps. (EDTA was used instead of trypsin because plasma membrane
damage caused by trypsin would increase leakage during and after the hypotonic
shock.) The cells were resuspended in 50 μl of 1× PBS at 1 x 10^ to 5 X 10"^ cells
per ml, transferred to Falcon 2058 tubes, and warmed to 37°C. Flow cytometry
analysis of β-Gal activity of the cells was performed as described previously (22,
53). Briefly, 50 μl of 2 mM fluorescein di-β-D-galactoside (FDG; Molecular
Probes) in H2O was added to the cells, and the mixture was incubated at 37°C for
1 min. At the end of the incubation, 10 volumes (1 ml) of ice-cold 1× PBS was
immediately added, and the reaction mixture was placed on ice for 20 to 60 min.
The reaction was terminated by addition of 2 mM phenylethyl-β-D-thiogalactoside
(Molecular Probes). Prior to sorting, propidium iodide (Sigma) was added
(5 μg/ml). Cells were analyzed and sorted at 4°C on a FACS machine (Vantage;
Becton Dickinson). Fluorescein isothiocyanate (FITC)-positive cells (in which
FDG had been converted to fluorescein) were collected into a 35-mm-diameter
Falcon dish coated with 1% SeaPlaque agarose in 1× PBS.
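
The volumes in the FDG loading step can be turned into working concentrations with simple C1·V1 = C2·V2 arithmetic. The sketch below is illustrative only: it reads the two 50-unit volumes as microliters (consistent with the stated 10 volumes being 1 ml), ignores the volume contributed by the cells themselves, and uses a hypothetical helper name, `final_concentration`.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """Dilution by C1*V1 = C2*V2; the caller keeps units consistent."""
    return stock_conc * stock_vol / total_vol


cells_ul = 50.0  # cell suspension in 1x PBS
fdg_ul = 50.0    # 2 mM FDG dissolved in water
loading_ul = cells_ul + fdg_ul

# During the 1-min loading at 37 C: FDG is halved, and PBS is diluted
# with water, giving the hypotonic shock mentioned in the text.
fdg_loading_mM = final_concentration(2.0, fdg_ul, loading_ul)
pbs_strength_x = final_concentration(1.0, cells_ul, loading_ul)

# Ice-cold chase: 10 volumes of 1x PBS relative to the loading mixture.
pbs_chase_ul = 10 * loading_ul
fdg_after_chase_mM = final_concentration(
    fdg_loading_mM, loading_ul, loading_ul + pbs_chase_ul
)

print(fdg_loading_mM, pbs_strength_x, pbs_chase_ul, round(fdg_after_chase_mM, 3))
```

On this reading the cells see about 1 mM FDG in roughly half-strength PBS during loading, and the cold PBS chase dilutes the external substrate about elevenfold while restoring near-isotonic conditions.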

Recovery of plasmids from FACS-selected cells. Plasmids in the sorted cells
were extracted by the Hirt protocol (32). Electroporation of highly competent E.
coli cells was used for efficient recovery of selected plasmids (29). Since the lacZ
reporter plasmid and the MyoD expression vector were constructed to carry
different bacterial selection markers (Fig. 1A), the reporter plasmids can be
selectively amplified in E. coli. The bacterial transformants were pooled, and
plasmids were extracted from them (Maxiprep; Qiagen). In the final round of the
222N selection, individual bacterial transformants were picked arbitrarily and
plasmids were extracted (Miniprep; Promega).

FIG. 2. 4N selection by SITE. (A and B) DNA sequencing gels showing
random sequences at desired positions in the initial 4N plasmid pool (A) and
emergence of CA and TG dinucleotides from the background after SITE (B). (C
and D) X-Gal staining of cells cotransfected with the 4N template and MyoD
expression vector before (C) and after (D) SITE.

Sequencing  of  recovered  plasmids.  DNA  sequencing  was  done  by  the  dideoxy 
method  (61),  using  Sequenase  (U.S.  Biochemical). 

β-Gal assays. β-Gal activities of transfected cells were monitored by either
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal) staining of
fixed cells (60) or o-nitrophenyl-β-D-galactopyranoside (ONPG) and Galacton-
Plus (Tropix) hydrolysis by cell lysates, in accordance with standard protocols
(59) or the manufacturer’s instructions, respectively.

RESULTS  AND  DISCUSSION 

The CANNTG motif is selected by MyoD to activate
transcription. A transient-transfection approach was used both to
generate MyoD-expressing cells and to select
MyoD-responsive sequences. This strategy was chosen because stable
expression of MyoD inhibits the cell cycle (15, 27, 28, 66) and because
myoblast cells that constitutively express MyoD may fuse, thus
complicating selection of individual cells. The selection
templates (Fig. 1B) were designed on the basis of the HCA
promoter (46), which is specific to skeletal and cardiac muscle and
contains a single E box that is required for MyoD-dependent
HCA promoter activity in skeletal muscle (62, 65). To test the
feasibility and fidelity of the SITE approach, we first used the
4N promoter template (Fig. 1B), in which the CA and TG
positions of the E box are random. When NIH 3T3 cells were
cotransfected with the 4N pool and a MyoD expression vector,
β-Gal activity was close to the basal level (Fig. 2C), suggesting
that only a very small portion (if any) of the random-sequence
combinations were activated by MyoD. In contrast, after two
rounds of selection, the pool β-Gal activity was dramatically
increased (Fig. 2D), and the DNA sequence of the recovered
plasmids revealed selection for the CA and TG dinucleotides
(compare Fig. 2B with Fig. 2A), thus confirming both their
importance and the validity of the selection system.

DNA sequence requirement for MyoD transactivation
through the CANNTG motif. To identify E boxes through
which MyoD can activate transcription, we used the 222N
template (Fig. 1B), in which base pairs of the HCA promoter
internal to and flanking the CANNTG consensus are of random
sequence. In preliminary studies, we determined by quantitative
β-Gal assay (59) that the wild-type HCA sequence can
generate signals at least 50-fold over the level in nontransfected
cells or cells transfected with a mutant construct. Consequently,
we set the FACS gates for FITC-positive cells at
levels well above basal levels in order to allow selection of cells
with only the highest levels of β-Gal activity, which probably
contain multiple copies of the most active sequences. In the
first round, the cells that were cotransfected with the 222N
population and MyoD turned the solutions slightly yellow (as a
result of FITC leakage) during the loading of FDG, suggesting
that the 222N pool contained a considerable portion of active
sequences. However, this activity was clearly much lower than
that observed for the wild-type sequence. Three rounds of
increasingly stringent FACS selection yielded a sequence pool
with activity that was significantly higher than that of the starting
population, as is apparent from the appearance of a new
peak in the FACS spectrum (Fig. 3A). This dramatic improvement
in pool activity is also indicated by much more leaking of
visible FITC from the cells during the FDG loading procedure
in the final round.

When individual plasmids recovered from this pool were
assayed for transcriptional activation by cotransfected MyoD,
most (15 of 17) were active, some (4 of 17) even more so than
the wild-type HCA promoter. Of this last most active group,
two (sequences 5 and 15) were represented multiple times
(Fig. 3B). None of these selected sequences were active without
cotransfection of the MyoD expression vector or could be
activated by expression of the E proteins alone. Although a
background of inactive plasmids (2 of 26) was still present in
the selected pool, these results confirm that the selection procedure
enriched significantly for MyoD-responsive sequences
(Fig. 3B).

Differences between in vivo activation and in vitro DNA
binding. Only a subset of the possible E-box CANNTG hexamer
core site motifs were represented among the selected
molecules that were responsive to MyoD, none of which contained
the CACGTG hexamers to which MyoD or MyoD-E
complexes are known not to bind (7). Of the 10 sequences that
were at least half as active as the wild-type HCA promoter, one
contained the wild-type CAACTG hexamer core sequence,
four contained a CAAGTG hexamer, and five contained a
CAGGTG sequence. Previously, binding-site selections performed
in vitro demonstrated that MyoD homodimers and
MyoD-E47 heterodimers prefer sites that contain the
CAGCTG and CAGGTG motifs, respectively, and identified
their preferred sequences at positions flanking these elements
(Fig. 3D) (8). It is remarkable that only a subset of the sequences
selected here in vivo contained a CAGGTG motif and
that relatively few (4 of 15) of the active plasmids contained the
base preferred by MyoD at position −4 or +4. However, the
majority of the sequences selected by this method are identical
or very similar to E boxes in the native promoters of the
transcriptional regulators MyoD, myogenin, and MRF4, as
well as in the promoters of many muscle structural protein
genes. For instance, the hexamer core CAAGTG, which was
not identified through in vitro binding selection but was recovered
repeatedly by SITE, has been observed to date only in the
MyoD promoter. Among the selected sequences, sequence 5 resembles the
MyoD distal regulatory region 1, which is important for transactivation
of the MyoD promoter (69), and sequence 15 is very
similar to the E box in the MyoD proximal regulatory region,
which is indispensable for both distal regulatory region transactivation
(69) and autoregulation (80) by MyoD. Although
the disproportionate representation of sequences 5 and 15
suggests that the arbitrarily chosen number of cells from which
plasmids were recovered may have contained sequences of
limited complexity, a variety of different sequences that indeed
differ from the preferred in vitro consensus were selected.

LABEL   FREQUENCY   SEQUENCE
#26     1           GTCAGGTGTG
#5      8           ATCAAGTGCA
#18     1           ACCACCTGCA
#15     3           ATCAAGTGCT
#40     1           GGCAGGTGTC
#28     1           AGCAGGTGTG
#27     1           TGCAAGTGAG
#9      1           GGCAGTTGAA
#30     1           GGCAACTGGA
#36     1           GTCAGGTGGC
#8      1           GACAGATGGG
#17

FIG. 3. SITE-selected sequences and transcriptional activities from the 222N template. (A) FACS sorting of the 222N pool. The sort gates for β-Gal activity were
set such that the ratios of FITC fluorescence intensity of the gate over that of the bulk of β-Gal-negative cells were 150, 500, and 700 for the first, second, and third
rounds, respectively. Note that the fluorescence intensity axis is logarithmic. Dead cells (propidium iodide bright) were eliminated through a second fluorescence gating.
Cell debris, doublets, and larger clumps were gated out by forward- and side-scattering settings. The upper panels are histograms of the FITC intensities of all cells;
the lower panels show the sorted cells while they were accumulating over an arbitrary period during the sorting. The cells containing FITC levels higher than the sort
gate were collected, and the plasmid DNA was extracted and amplified in E. coli cells in order to do the next round of SITE. (B) Transcriptional activation by sequences
selected from the third round of SITE. The plasmids recovered from the third round of the selection were transfected individually with the MyoD expression vector
into 3T3 cells. β-Gal activity was measured from the cell lysates, using ONPG as the substrate. Each datum point was the average of at least three independent
transfections. The data were plotted as the fold activation by MyoD over that of pEMSVscribe (MyoD control vector). The Anecdote column lists known
muscle-specific enhancer/promoters that contain E boxes with sequences identical or very similar to those selected by SITE. Only E boxes that were tested to be
functional in the literature are listed. Note that in sequence 5, although the 3' flanking sequence CA can in theory be grouped with nucleotides immediately downstream
to form another E box (CACCTG site), this second E box is nonfunctional (data not shown). mAchR-γ, mouse acetylcholine receptor γ subunit; DRR1, distal regulatory
region 1; PRR, proximal regulatory region; sTnC, smooth muscle troponin C; MLC, myosin light chain; w.t., wild type; mut, mutant. (C) Alignment of selected E-box
sequences to known enhancer/promoter sequences in muscle-specific genes. (−), the sequence displayed is antisense to the template strand. (D) Summary of in vitro
DNA binding preferences of myogenic bHLH and E proteins. *, data from reference 8; †, data from reference 24; ‡, data from reference 80. A line drawn over an
uppercase letter indicates a base that is absent in that position, and a line over a lowercase letter indicates a decrease in use of that base.
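
The distribution of core hexamers among the recovered clones can be tallied directly from the Fig. 3B listing. The following is a minimal sketch, not part of the original analysis: it assumes that each listed 10-bp site carries its CANNTG core at positions 3 to 8, and it uses only the eleven complete entries of the listing (entry #17 is incomplete and is omitted). The function names are illustrative.

```python
from collections import Counter

# (label, frequency, decamer) copied from the Fig. 3B listing.
# Entry #17 has no frequency or sequence in the listing and is left out.
SELECTED = [
    ("#26", 1, "GTCAGGTGTG"),
    ("#5", 8, "ATCAAGTGCA"),
    ("#18", 1, "ACCACCTGCA"),
    ("#15", 3, "ATCAAGTGCT"),
    ("#40", 1, "GGCAGGTGTC"),
    ("#28", 1, "AGCAGGTGTG"),
    ("#27", 1, "TGCAAGTGAG"),
    ("#9", 1, "GGCAGTTGAA"),
    ("#30", 1, "GGCAACTGGA"),
    ("#36", 1, "GTCAGGTGGC"),
    ("#8", 1, "GACAGATGGG"),
]


def core_hexamer(decamer):
    """Return the central hexamer (positions 3-8) and check it is CANNTG."""
    core = decamer[2:8]
    if not (core.startswith("CA") and core.endswith("TG")):
        raise ValueError(f"{decamer} has no central CANNTG core")
    return core


def tally(sites):
    """Count each core hexamer per recovered clone and per distinct sequence."""
    by_clone = Counter()
    by_sequence = Counter()
    for _label, frequency, decamer in sites:
        core = core_hexamer(decamer)
        by_clone[core] += frequency
        by_sequence[core] += 1
    return by_clone, by_sequence


if __name__ == "__main__":
    by_clone, by_sequence = tally(SELECTED)
    for core, n in by_clone.most_common():
        print(core, "clones:", n, "distinct sequences:", by_sequence[core])
```

Because the listing is truncated, these counts describe only the entries shown and are not expected to reproduce the totals quoted in the text. The tally is also strand-specific: CACCTG (#18) is the reverse complement of CAGGTG and is counted separately here.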

Perhaps the most striking aspect of the selected MyoD-responsive
sequences is that they closely resemble functional E
boxes that are present in the promoters of muscle-specific
genes (Fig. 3B and C) (5, 13, 14, 23, 26, 38, 40, 42, 43, 57, 63,
78) but not those found in the target genes of nonmuscle
bHLH proteins (see references in reference 8). This resemblance
is apparent not only in their core hexamers but also in
flanking sequences, many with identical pentamer half-sites
(Fig. 3C). In their natural promoter context, they are also
capable of mediating transcriptional activation by the myogenic
bHLH proteins, which apparently function as heterodimers
with E proteins. These muscle-specific sites can, in
general, be bound in vitro by MyoD and the other myogenic
bHLH proteins and by heterodimers of these proteins with E
proteins. The MyoD-responsive sequences selected here by
SITE can similarly be bound by MyoD-E heterodimers (although
the in vitro binding affinities do not parallel the in vivo
transcriptional activity [data not shown]), and when we examined
them in the cotransfection assay, we found that they
mediate activation by a tethered MyoD-E47 dimer (52) to
levels no higher than those characteristic of cotransfected
MyoD (data not shown). The simplest interpretation of these
findings is that these selected sites are direct targets of heterodimers
of cellular E proteins with the cotransfected MyoD
or with other myogenic bHLH proteins that may be activated
by MyoD (see below).

The SITE data thus support the notion that DNA binding
alone is not sufficient for a bound MyoD (or other myogenic
bHLH protein)-E heterodimer to be transcriptionally active.
To confirm that this SITE experiment did not somehow simply
miss functional sequences that more closely resemble optimal
MyoD and MyoD-E in vitro binding sites, we tested such sites
(Fig. 3B) in the HCA cassette for transactivation by MyoD. In
this assay, surprisingly, an optimal in vitro MyoD-E site was
inactive, and the activity of a preferred MyoD homodimer site
was very low (Fig. 3B). On the other hand, SITE recovered
some transcriptionally active plasmids bearing a sequence that
was strongly selected against by in vitro methods, e.g., T at the
−4 position of the E47 protein half-site (8, 67). These data
indicate that the E boxes that are most responsive to MyoD in
vivo do not necessarily correspond to the in vitro binding
preferences of MyoD or MyoD-E complexes.

Role of promoter context in MyoD transactivation through
different E boxes. One possible explanation for differences in
vivo and in vitro is that in vivo repressors might bind to certain
E-box sequences and prevent MyoD from binding to these
sites and activating transcription. Alternatively, differences
among interacting proteins on these promoters might affect the
conformation of the DNA and/or of the protein-DNA complex
in a way that prevents particular E-box sequences from acting
as an effective enhancer. To distinguish between these possibilities,
we compared the abilities of different E boxes to activate
transcription in the HCA promoter and the MCK enhancer/promoter.
The 3,300-bp enhancer/promoter of MCK
contains two E boxes that are targets of MyoD activation (37,
74). A 115-bp segment including these E boxes that retains all
of the muscle-specific activity of the enhancer (35) was used for
the E-box substitution experiment (Fig. 4A), the results of
which are shown in Fig. 4B. The sequence 5 E-box decamer, a
strong binding site in vitro and a strong activation site in the
HCA promoter, was only moderately active when replacing the
right (R) site of the MCK enhancer. Its activity was even lower
if both left (L) and R sites were replaced. On the other hand,
the in vitro optimal binding site (construct aa), although inactive
in the HCA context, was the most active in the MCK
context. It is interesting that the HCA E box, a poor binding
site in vitro, acts as a strong activation site both in its native
context and when replacing the L and R sites simultaneously in
the MCK enhancer. Thus, these data for the MCK promoter
confirmed the notion that the ability of an E-box enhancer to
support transcriptional activation cannot be predicted from its
affinity for the protein factors. These results demonstrate that
promoter context is an important determinant of activation
properties of different E-box sequences.

FIG. 4. Comparison of transcriptional activities of different E boxes in the
MCK and HCA promoter contexts. (A) Molecular structure and enhancer
sequence of the MCK enhancer and promoter. Only the two E boxes (L and R
sites) are shown. (B) Comparison of transcriptional activities of different E boxes
in the MCK and HCA promoter contexts. X's stand for different E-box
sequences; HCA (X) denotes X replacing the E box in the HCA promoter, MCK
(L-X) denotes X replacing the R site in the MCK enhancer, and MCK (2X)
denotes X replacing both the L and R sites in the MCK enhancer. Each datum
point was derived from transfection of two independent clones. β-Gal activity
was measured from cell lysates by using Galacton-Plus (Tropix) as the substrate.

It is interesting that a majority (20 of 26) of the sites that
were selected bear a MyoD half-site on the 5' end in proximity
to the Sp1 site on the HCA promoter (Fig. 1B and 3B). However,
it is unlikely that Sp1 could be the determinant of context
specificity because the MCK enhancer also contains an Sp1
site. Sp1 sites are also found adjacent to MyoD half-sites in the
promoters of the acetylcholine receptor δ subunit (AchR-δ),
myogenin, troponin, and MyoD genes, although the importance
of these Sp1 sites has not been uniformly evaluated.
Thus, it is not obvious why the activities of some E boxes are
so different in the HCA and MCK promoters. Our results thus
form a basis for further investigation of cis-acting elements that
are involved in determining promoter context specificity.

Application of the SITE method. The SITE technique described
here has, for the first time, used the power of random-sequence
selection (6, 54, 64, 68) in metazoan cells to identify
functional regulatory sites. It can yield sequences that are responsive
to any transcription factor, or to a protein that activates
such a factor, provided that either the cloned gene or a
cell line expressing the factor is available. The reiterative selection
process permits recovery of a large number of sites that
mediate a range of transcriptional activities, with more stringent
selection yielding more active sites. This method is likely
to yield biologically significant sequences that might not be
identified by in vitro methods, because they are selected in the
presence of the full complement of cellular factors under physiological
conditions and can thus provide rational guidance for
designing reporter constructs. As such, the in vivo selection
method provides an approach complementary to the in vitro
method for discovering target sequences for regulatory proteins.
A more important benefit of this strategy is that it allows
a systematic comparison of the functional requirements of
proteins under different circumstances and conditions and can
thus illuminate critical regulatory mechanisms. For example, a
comparison of the MyoD-responsive sequences identified in
this study with the preferred in vitro binding sites of MyoD
suggests the existence of particular sequence requirements for
activation. The versatility of the SITE approach will now make
it possible to undertake similar investigations of activation by
MyoD (and its BR variants) in the context of other promoters
and associated factors and, similarly, to identify sequences that
respond to other bHLH proteins.

The SITE technique should be adaptable to a variety of
other experimental situations. For example, repression sequences
could be selected, provided that a screening step first
identified cells that contain the reporter library DNA. It should
be similarly possible to select native regulatory sequences from
a library of genomic DNA fragments or, if transcribed positions
were randomized, to identify sequences involved in RNA
processing or translational control. If positions within coding
sequences were randomized, a selection could identify the
amino acid residues that are encoded by the corresponding
positions which allow a transcription factor to function, as has
been done in bacteria (34) and in Saccharomyces cerevisiae
(56). In principle, the number of positions that can be randomized
in the SITE technique should be limited only by the
number of cells that can be handled. For example, eight random
positions will generate 4^8 (about 6.6 × 10^4) different
plasmid molecules. If one molecule were introduced per cell,
screening of a library of this size would require on the order of
3 × 10^5 cells (59), a number readily attainable by transient
transfection. However, the efficiency of screening is actually
higher, because each cell takes up multiple plasmids and because
even at a high input copy number, positive cells can be
identified by FACS, which can detect as few as 5 β-Gal enzyme
molecules per cell (53), is rapid (1/10^ cells per h), and can be
applied to different markers (e.g., the cell surface molecule
CD4 [45] or green fluorescent protein [GFP] [11]). Multiple
markers can be used either alone or simultaneously to examine
different sets of regulatory sequences. Alternatively, the technique
may be applicable to other transient selection markers
(e.g., puromycin resistance [71]).
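
The library-size arithmetic in the preceding paragraph can be checked with a short calculation. The sketch below is illustrative only. It assumes that the number of cells to be screened follows the standard library-representation estimate N = ln(1 − P) / ln(1 − 1/n), where n is the number of distinct sequences and P is the desired probability of sampling any given sequence at least once. The value P = 0.99 is an assumption made here, as the text gives only the order of magnitude and cites reference 59 for it.

```python
import math


def library_size(n_random_positions, alphabet=4):
    """Number of distinct sequences for fully randomized nucleotide positions."""
    return alphabet ** n_random_positions


def cells_needed(n_variants, probability=0.99):
    """Cells to screen (one plasmid per cell) so that any given variant is
    sampled at least once with the stated probability.

    Uses N = ln(1 - P) / ln(1 - 1/n); P = 0.99 is an assumed default.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if n_variants < 2:
        raise ValueError("n_variants must be at least 2")
    return math.ceil(math.log(1.0 - probability) / math.log1p(-1.0 / n_variants))


if __name__ == "__main__":
    n = library_size(8)
    print("distinct sequences:", n)
    print("cells needed at P = 0.99:", cells_needed(n))
```

Under these assumptions, eight random positions give 65,536 sequences and a requirement of roughly 3 × 10^5 cells, consistent with the order of magnitude stated in the text. Each additional randomized position multiplies both figures by about four.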

Specificity of MyoD-mediated transcriptional activation. Although
the sequences identified here are responsive to MyoD,
it is possible that these sequences actually are or include the
direct targets of another myogenic bHLH protein. In cultured
cells, expression of a myogenic bHLH protein can activate the
corresponding endogenous gene along with other members of
this group, with the particular myogenic genes activated varying
among different lines (2, 33, 70). In the 3T3 cells used in
this study, MyoD generally does not activate the endogenous
MyoD, myf-5, or MRF4 gene but can activate the myogenin
gene. In the mouse, myf-5 and MyoD determine the myoblast
cell fate (58), while myogenin is required to activate the full
spectrum of sarcomeric genes (30, 50), but in cells cultured
from myogenin−/− mice, MyoD can also activate these genes
(50). Accordingly, we cannot determine whether in our experiments
myogenin might have also contributed to the activation
of sequences selected. However, it is unlikely that only myogenin
(but not MyoD) is the direct activator, because when the
selected sequences were cotransfected with a myogenin expression
vector, they were activated to levels only half as high as
those achieved by MyoD. As these sites are bound by MyoD-E
heterodimers in vitro and activated similarly by MyoD and
MyoD-E tethered dimers in vivo (data not shown), and since
the DNA binding requirements of MyoD in vitro and those of
myogenin with the nuclear extracts are virtually identical (Fig.
3D) (8, 79), the most likely conclusion appears to be that the
selected regulatory sites may be the direct targets of heterodimers
of either MyoD or myogenin with endogenous cellular
E proteins.

The group of sequences that were identified in this experiment
share some attributes with, but are not identical to, the
preferred in vitro binding sites for MyoD (or myogenin) or
MyoD-E complexes. In the HCA promoter, these bHLH complexes
might bind cooperatively with neighboring proteins, and
these interactions might have contributed enough binding energy
to have relaxed the sequence requirements for activation.
However, two lines of evidence argue against this interpretation.
First, an optimal site was not capable of supporting activation
in this context (Fig. 3B). Second, the selected sequences
were actually less similar to the in vitro preferences than to E
boxes found in the native promoters of muscle-specific genes.
These native E boxes have been shown to be important for
auto- and/or cross-activation of the myogenic bHLH genes or
for activation of muscle structural genes by these bHLH proteins,
suggesting that the sequences selected here by SITE are
biologically significant.

We have selected E-box sequences that are able to support
transcriptional activation by the muscle-specific bHLH protein
MyoD. The data indicate that only a subset of MyoD binding
sites allow activation of transcription in the HCA promoter
context and that certain E boxes function differently in different
muscle-specific promoters. These findings demonstrate
that promoter context plays an important role in transcriptional
activation by MyoD through E boxes. Promoter context
could affect transcriptional activation in several ways. For example,
the neighboring protein factors could influence the
conformation of the promoter DNA and/or of the protein-E
box complex; different proteins could also recruit distinct associated
factors to the promoters. In addition, our findings are
reminiscent of the positive-control MyoD BR mutations (17,
76), which also appear to permit DNA binding but not activation.
Significantly, these same BR residues not only appear to
influence the conformation of the protein-DNA complex (19,
44) but also have been implicated in interactions with proposed
regulatory factors (9, 16, 17, 47, 76). The findings described
here indicate that some, or perhaps all, of the potential regulatory
interactions require specific DNA sequences, which
themselves could influence the conformation of the bound
MyoD-E complex. It remains to be determined whether such
conformational changes might provide distinctive targets for
coactivators or repressors or would derive from such interactions.

Abstract

Tristetraprolin (TTP), the prototype of a class of CCCH zinc finger proteins, is a phosphoprotein that is rapidly and transiently
induced by growth factors and serum in fibroblasts. Recent evidence suggests that a physiological function of TTP is to inhibit
tumor necrosis factor α secretion from macrophages by binding to and destabilizing its mRNA (Carballo, E., Lai, W.S.,
Blackshear, P.J., 1998. Science, 281, 1001-1005). To investigate possible functions of CCCH proteins in early development of
Xenopus, we isolated four Xenopus cDNAs encoding members of this class. Based on 49% overall amino acid identity and 84%
amino acid identity within the double zinc finger domain, one of the Xenopus proteins (XC3H-1) appears to be the homologue
of TTP. By similar analyses, XC3H-2 and XC3H-3 are homologues of ERF-1 (cMG1, TIS11B) and ERF-2 (TIS11D). A fourth
protein, XC3H-4, is a previously unidentified member of the CCCH class of vertebrate zinc finger proteins; it contains four
Cx8Cx5Cx3H repeats, two of which are YKTEL Cx8Cx5Cx3H repeats that are closely related to sequences found in the other
CCCH proteins. Whereas XC3H-1, XC3H-2, and XC3H-3 were widely expressed in adult tissues, XC3H-4 mRNA was not
detected in any of the adult tissues studied except for the ovary. Its expression appeared to be limited to the ovary, oocyte, egg
and the early embryonic stages leading up to the mid-blastula transition. Its mRNA was highly expressed in oocytes of all ages,
and was enriched in the animal pole cytosol of mature oocytes. Maternal expression was also seen with the other three messages,
suggesting the possibility that these proteins are involved in regulating mRNA stability in oocyte maturation and/or early
embryogenesis. © 1999 Elsevier Science B.V. All rights reserved.

Keywords:  AU-rich  element;  Deadenylation;  Mid-blastula  transition;  mRNA  stability 


Abbreviations: ARE, AU-rich element; bp, base pair(s); CPSF, cleavage
and polyadenylation specificity factor; HCG, human chorionic
gonadotropin; kb, kilobase(s); MBS, modified Barth's saline; MBT,
mid-blastula transition; PABP, poly(A) binding protein; RACE, rapid
amplification of cDNA ends; TNFα, tumor necrosis factor α; TTP,
tristetraprolin; UTR, untranslated region.


1. Introduction

Members of the CCCH class of zinc finger proteins
contain two or more zinc finger motifs, each with a
cysteine-histidine repeat in a cys-cys-cys-his (CCCH)
configuration. Members of a subclass of this family
contain two putative zinc fingers with tandem YKTEL
Cx8Cx5Cx3H repeats (where x is a variable amino acid),
in which the H of the first finger is separated from the
first C of the second finger by 18 amino acids. Three
members of this group have thus far been identified in
mammals: Tristetraprolin (TTP) (Lai et al., 1990), also
known as TIS11 (Varnum et al., 1989; Ma and
Herschman, 1991), Nup475 (DuBois et al., 1990), and
G0S24 (Heximer and Forsdyke, 1993); cMG1
(Gomperts et al., 1990), also known as TIS11B (Varnum
et al., 1991), ERF-1 (Barnard et al., 1993), and Berg36
(Ning et al., 1996); and TIS11D (Varnum et al., 1991),
also known as ERF-2 (Nie et al., 1995). The structure
of one of the previously hypothetical zinc fingers has
recently been resolved by nuclear magnetic resonance
spectroscopy; this finger has been shown to bind zinc
with high affinity (Worthington et al., 1996). Other
members of this gene family have been identified in
organisms as diverse as Drosophila, C. elegans, and yeast
(Mello et al., 1992, 1996; Ma et al., 1994; Warbrick and
Glover, 1994; Seydoux et al., 1996; Thompson et al.,
1996; Guedes and Priess, 1997).
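
The tandem finger arrangement described above (two Cx8Cx5Cx3H units, with 18 residues between the H of the first finger and the first C of the second) can be written as a simple pattern for scanning protein sequences. The following is a minimal sketch and not a method used in this study. The function names are illustrative, the test sequence is synthetic, and the pattern deliberately ignores the YKTEL lead-in and any residue preferences within the spacers.

```python
import re

# One CCCH finger: C, 8 residues, C, 5 residues, C, 3 residues, H.
FINGER = r"C.{8}C.{5}C.{3}H"

# Two fingers with exactly 18 residues between the H of the first
# finger and the first C of the second, as described in the text.
TANDEM = re.compile(r"(?=(" + FINGER + r".{18}" + FINGER + r"))")


def find_tandem_ccch(protein):
    """Return 0-based start positions of tandem Cx8Cx5Cx3H fingers."""
    return [m.start() for m in TANDEM.finditer(protein.upper())]


def synthetic_finger(filler="A"):
    """Build an artificial finger for demonstration (not a real sequence)."""
    return "C" + filler * 8 + "C" + filler * 5 + "C" + filler * 3 + "H"


if __name__ == "__main__":
    demo = "MS" + synthetic_finger() + "G" * 18 + synthetic_finger() + "KL"
    print(find_tandem_ccch(demo))
```

The lookahead wrapper lets overlapping candidates be reported. A spacer length other than 18 residues, or any change in the cysteine spacing, yields no match, which makes the pattern strict. A more tolerant search would need to relax these constraints explicitly.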

TTP, the best studied member of this family, derives
its name from three PPPPG motifs that are highly
conserved in mammals (Taylor et al., 1995). It is
encoded by Zfp36 in mouse and ZFP36 in man, which
have been mapped to chromosomes 7 and 19q13.1,
respectively (Taylor et al., 1991). In fibroblasts, it is
rapidly induced at the level of transcription by several
mitogens, including insulin, serum, platelet-derived
growth factor, fibroblast growth factor, and phorbol
12-myristate 13-acetate (Varnum et al., 1989; DuBois
et al., 1990; Lai et al., 1990). In addition, TTP is rapidly
phosphorylated on serine residues after stimulation of
cells by the same mitogens (Taylor et al., 1995), which
also rapidly stimulate translocation of TTP from the
nucleus to the cytoplasm in fibroblasts (Taylor et al.,
1996a,b). Because there is evidence that TTP binds zinc
(DuBois et al., 1990; Worthington et al., 1996)
and is localized, at least part of the time, in the nucleus
(DuBois et al., 1990; Taylor et al., 1996a,b), it has been
suggested that TTP may be a nuclear transcription
factor like the other immediate-early response genes.

Mice made deficient in TTP by gene targeting
appeared normal at birth but soon developed a complex
syndrome that included cachexia, dermatitis, erosive
arthritis, autoimmunity, and myeloid hyperplasia; essentially
all aspects of this phenotype could be prevented
by the injection of antibodies to tumor necrosis factor
α (TNFα) (Taylor et al., 1996a,b). TTP was later found
to inhibit TNFα production from macrophages and
perhaps other cells, in that macrophages from TTP-knockout
mice produced excessive amounts of TNFα
(Carballo et al., 1997). This was associated with
increased cellular concentrations of TNFα mRNA
(Carballo et al., 1997).

This increase in TNFα mRNA
levels was recently found to be due, at least in part, to
stabilization of TNFα mRNA in the TTP-deficient
macrophages (Carballo et al., 1998). This was associated
with direct binding of TTP to the AU-rich element
(ARE) of the TNFα mRNA (Carballo et al., 1998).
Thus, TTP appears to enhance the breakdown of TNFα
mRNA and other messages containing similar AREs,
such as granulocyte-macrophage colony stimulating
factor (GM-CSF) and interleukin-3 (IL-3), by binding
directly to the AREs of these messages and destabilizing
them in some way (Carballo et al., 1998).

Regulation of mRNA stability is of critical importance
in early Xenopus development, partly because no
zygotic transcription occurs until the mid-blastula transition
(MBT), several hours and several hundred cell
divisions  after  egg  fertilization.  Thus,  early  embryonic 
development  in  this  species  relies  on  the  presence  of 
maternal  mRNAs,  whose  stability  is  therefore  crucial  to 
this  process. 

For  these  reasons  and  because  of  the  relative  ease 
with  which  specific  proteins  can  be  introduced  and 
expressed  (Moon  and  Christian,  1989)  or  ‘knocked  out’ 
(Dash  et  al.,  1987;  Shuttleworth  and  Colman,  1988; 
Heasman  et  al.,  1994;  Wylie  et  al.,  1996;  Kofron  et  al., 
1997)  in  early  development,  we  attempted  to  isolate 
members  of  the  CCCH  class  of  zinc  finger  proteins  in 
Xenopus. In this report, we describe the cloning of four
members of this class in this organism, three of which
appear to be homologues of the three previously identified
mammalian proteins of the double CCCH zinc
finger class. We also isolated a previously unidentified
member of this family, which, instead of the usual two
Cx8Cx5Cx3H motifs, contained a total of four such
motifs. This transcript appeared to be highly expressed
only in the ovary, oocyte, egg, and early embryo. The
other three mRNAs were also maternally expressed to
varying degrees, suggesting potential roles for all four
family members in mRNA stability during oocyte maturation
and/or early embryogenesis.


2.  Materials  and  methods 

2.1. Screening the Xenopus kidney library

Reverse transcriptase-polymerase chain reaction
(RT-PCR) was performed on Xenopus egg poly-A+
RNA using degenerate primers based on conserved
regions of mammalian CCCH zinc finger motifs. PCR
products were subcloned using the TA cloning kit
(Invitrogen, San Diego, CA), then cloned into the
EcoRI site of the pCR2.1 vector (Stratagene, La Jolla,
CA). One of the PCR products, X50, was used to screen
a  Xenopus  kidney  cDNA  library  (Stratagene).  Positive 
clones  were  grouped  according  to  the  sizes  of  their 
restriction  fragments,  and  the  similarities  of  fragments 
that  hybridized  to  the  original  PCR  probe.  Subclones 
were  sequenced  by  the  dideoxy-chain  termination 
method,  and  sequence  alignments  were  performed  using 
MacVector  6.0  computer  software  (Oxford  Molecular 
Group,  Campbell,  CA). 
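
The degenerate-primer step can be illustrated with a short sketch. The primer below is hypothetical: it is simply one possible back-translation of the conserved YKTELC lead-in into IUPAC ambiguity codes, and it is not the primer actually used in this work, whose sequence is not given here. The function names are likewise illustrative. The sketch computes how many distinct oligonucleotides a degenerate primer represents and enumerates them.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


# Hypothetical primer (assumption, not from the source): Tyr-Lys-Thr-Glu-Leu
# followed by the first two bases of a Cys codon.
# TAY = Tyr, AAR = Lys, ACN = Thr, GAR = Glu, YTN = Leu (also matches Phe TTY), TG = Cys (partial).
HYPOTHETICAL_PRIMER = "TAYAARACNGARYTNTG"

if __name__ == "__main__":
    print(degeneracy(HYPOTHETICAL_PRIMER))
    print(expand(HYPOTHETICAL_PRIMER)[:3])
```

Enumerating the pool this way is only practical for modest degeneracy; for design purposes the count returned by `degeneracy` is usually the figure of interest.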

2.2.  Northern  blotting 

Northern blots were performed as described (Lai et al., 1990),
using random-primed 32P-labeled Xenopus cDNA fragments
as probes. Roughly equivalent loading was confirmed
by acridine orange staining of the gels prior to
transfer. Where indicated, mRNA levels on northern
blots were quantified using a Phosphorimager
(Molecular Dynamics, Sunnyvale, CA).

2.3.  In  situ  hybridization 

Antisense  and  sense  digoxigenin-labeled  RNA  probes 
representing  bp  1-245  and  245-565  of  XC3H-4  were 
used  for  in  situ  hybridization  histochemistry  as  described 
(Zeldin et al., 1997), except that hybridization and
washes  were  performed  at  48°C,  and  the  colorimetric 
reaction  was  performed  without  10%  polyvinyl  alcohol. 


3.  Results 

3.1. Screening the Xenopus kidney cDNA library

Screening  of  a  Xenopus  kidney  cDNA  library  with 
the 189 bp PCR probe yielded 152 positive plaques. Sixty-two
of these were purified and subjected to restriction mapping
and Southern analysis. Inserts of identical size
yielding identical restriction fragments that hybridized
to the probe were grouped together. One clone from
each of 15 groups was selected for sequencing, and in
each case the conserved YKTELCx8Cx5Cx3H motif was
identified in the predicted amino acid sequence. The
initial  15  cDNA  sequences  were  obtained  using  universal 
primers  on  the  complete  insert.  If  different  size  inserts 
shared  identical  sequences,  they  were  also  grouped 
together.  Next,  the  largest  cDNA  insert  from  each  of 
the  five  remaining  groups  was  subcloned  into 
pBluescribe  as  described  in  Section  2  and  sequenced, 
initially  using  universal  primers,  and  subsequently  with 
internal  primers.  Sequences  for  complete  open  reading 
frames  coding  for  four  distinct  proteins  were  obtained 
by  this  means,  and  are  described  below. 
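
The grouping rule used above (identical insert size together with identical probe-positive restriction fragments) can be sketched as a simple keyed grouping. The clone names and sizes below are invented for illustration only and are not data from this screen; exact matching of sizes is also a simplification, since sizes read from a gel are approximate.

```python
from collections import defaultdict


def group_clones(clones):
    """Group clone names by (insert size, sorted probe-positive fragment sizes).

    `clones` is an iterable of (name, insert_bp, fragments_bp) tuples.
    """
    groups = defaultdict(list)
    for name, insert_bp, fragments_bp in clones:
        # Sorting makes the key independent of the order fragments were recorded in.
        key = (insert_bp, tuple(sorted(fragments_bp)))
        groups[key].append(name)
    return dict(groups)


# Hypothetical clones (assumption): name, insert size in bp, hybridizing fragment sizes in bp.
EXAMPLE_CLONES = [
    ("clone-a", 2800, [1200, 1600]),
    ("clone-b", 2800, [1600, 1200]),
    ("clone-c", 1600, [1600]),
    ("clone-d", 1200, [500, 700]),
]

if __name__ == "__main__":
    for key, names in group_clones(EXAMPLE_CLONES).items():
        print(key, names)
```

In a real analysis one would probably bin sizes within a tolerance before building the key, and then merge groups further once sequence data show that inserts of different size share identical sequence.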

3.2.  Cloning  and  characterization  of  XC3H-1,  a  TTP 
homologue 

We  have  adopted  a  simplified  nomenclature  for  these 
proteins,  with  X  referring  to  Xenopus,  and  C3H  referring 
to  the  CCCH  class  of  zinc  fingers.  XC3H-1  contained  a 
2756  bp  insert  that  included  a  single  open  reading  frame 
encoding  a  protein  of  313  amino  acids  with  a  predicted 
molecular mass of 34.8 kDa, pI 9.3. This insert contained
19 bp of 5' untranslated region (UTR) and 1998 bp of
3' UTR (GenBank accession number AF061980). The
predicted amino acid sequence contained three highly
conserved PPPPG-like motifs; this fact, along with an
84% amino acid identity in the 63 residue region containing
the two CCCH motifs, and an overall 49% amino
acid identity to human TTP (Taylor et al., 1991),
suggested that this protein was the Xenopus homologue
of mammalian TTP (Fig. 1). An alignment of the
XC3H-1 amino acid sequence with sequences of other
mammalian (mouse, rat, and bovine) TTP homologues
(Lai  et  al.,  1990;  Taylor  et  al.,  1995)  revealed  that 
XC3H-1  exhibited  a  48-50%  amino  acid  identity  with 
each  (data  not  shown).  In  addition,  serine  220  in  mouse 
TTP  (Lai  et  al.,  1990),  previously  identified  as  a  p42 
MAP  kinase  phosphorylation  site  (Taylor  et  al.,  1995), 
was  also  conserved  in  XC3H-1  as  serine  222. 
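
Percent identity figures of the kind quoted here depend on how gaps and the denominator are treated. The sketch below uses one common convention, identical residues divided by the number of alignment columns in which both sequences have a residue; this is an assumption and is not necessarily the convention applied by the alignment software used in this work. The two example strings are a single 30-residue row of the XC3H-2 and ERF-1 alignment of Fig. 3 (XC3H-2 residues 117-146), as transcribed from this text, and may carry transcription errors.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# One row of the Fig. 3 alignment, covering the first CCCH finger.
XC3H2_ROW = "KTELCRPFEENGSCKYGDKCQFAHGIHELR"
ERF1_ROW = "KTELCRPFEENGACKYGDKCQFAHGIHELR"

if __name__ == "__main__":
    print(round(percent_identity(XC3H2_ROW, ERF1_ROW), 1))
```

Counting gapped columns in the denominator instead would lower the value for alignments with many insertions, which is one reason overall identities can differ between programs.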

The ~4 kb XC3H-1 transcript was widely expressed
in adult Xenopus tissues (Fig. 2A); its expression was
detectable in the oocyte and egg, but not in the early
embryo, and its expression did not become prominent
until after neural tube formation in embryonic development
(Fig. 2B). Its expression in the ovary was considerably
greater than that of the oocyte and egg per amount
of RNA (Fig. 2A and B), suggesting that XC3H-1 might
be highly expressed in the ovarian follicle cells or
surrounding interstitium.


3.3. Cloning and characterization of XC3H-2, a cMG1
(TIS11B, ERF-1) homologue

XC3H-2 contained a 2896 bp insert (GenBank accession
number AF061981) that predicted a single open
reading frame encoding a protein of 345 amino acids,
with a predicted molecular mass of 38.3 kDa, pI 9.1.
This insert contained 240 bp of 5' UTR and 1621 bp of
3' UTR. In the 63 amino acid region containing the two
CCCH repeats, the predicted amino acid sequence was
98% identical to that of rat cMG1, murine TIS11B, and
human ERF-1 (Gomperts et al., 1990; Varnum et al.,
1991; Barnard et al., 1993); the overall amino acid
identity between the Xenopus and the human protein
was 76% (Fig. 3). Northern analysis revealed widespread
expression of one transcript of about 2.7 kb in
adult Xenopus tissues (Fig. 4A). However, during the
early stages of embryonic development, XC3H-2 was
expressed as multiple transcripts (~3 kb, ~2.5 kb,
~1.3 kb). The adult ~4 kb transcript was expressed
only in the later stages of development following the
mid-blastula transition (MBT), when zygotic transcription
is initiated (Fig. 4B). As discussed in more detail
below, the XC3H-4 mRNA was approximately the same
size as the ~1.3 kb transcript that hybridized to the
XC3H-2 probe present in the egg and in embryonic
stages 7 and 9; this fact, along with the high level
expression of XC3H-4 in the ovary and early embryo,
suggested the possibility that the 1.3 kb species represented
cross-hybridization of the XC3H-2 probe to
XC3H-4. However, when a 1.6 kb BamHI fragment of
XC3H-2 that contained most of the 3' UTR was used
as a probe, it also hybridized to the 1.3 kb transcript
(data not shown), indicating that this transcript was one
of the multiple transcripts detected in the oocyte, egg
and early embryo, and was not the XC3H-4 transcript.

Fig. 1. Predicted amino acid sequence of XC3H-1 aligned with human TTP. The human DNA sequence is from Taylor et al. (1991) (GenBank
accession number M63625); the Xenopus DNA sequence is in GenBank (accession number AF061980). The CCCH residues in the two zinc finger
repeats are delineated by boxes; residues that are identical in both proteins are shaded in gray. The three PPPPG groups in the mouse sequence
are underlined. Amino acid positions are numbered on both sides. The alignment was performed using the ClustalW Alignment program in
MacVector 6.0 computer software (Oxford Molecular Group, Campbell, CA).

3.4. Cloning and characterization of XC3H-3, a TIS11D
homologue

XC3H-3 contained a 1.6 kb insert that predicted an
incomplete open reading frame. The 5' end of this coding
sequence was sought by RT-PCR on Xenopus egg RNA,
using a 5' internal primer derived from previous 5'
RACE of X44 as described in Section 2, whose partial
sequence was identical to that of XC3H-3 and shared
significant sequence similarity to ERF-2 (TIS11D)
(Varnum et al., 1991; Nie et al., 1995). The 3' internal
primer contained a unique SphI enzyme site and was
about 205 bp from the 5' end of XC3H-3. A full-length
clone was generated by substituting the 443 bp of PCR
product for the first 205 bp BamHI/SphI fragment of
the incomplete clone.

The  resulting  1825  bp  of  DNA  sequence  (GenBank 
accession  number  AF061982)  contained  an  open  reading 
frame  encoding  a  protein  of  364  amino  acids  with  a 
predicted molecular mass of 40.3 kDa, pI 9.3. The 3'
UTR consisted of 733 bp and ended with a poly A tail.
The predicted amino acid sequence of XC3H-3 resembled
that of the third known mammalian member of the
family, TIS11D (ERF-2) (Varnum et al., 1991; Nie
et al., 1995; Phillips and Blackshear, unpublished data),
with 97% amino acid identity over the 63 residue double
zinc finger region and an overall 71% amino acid identity
with the human homologue, ERF-2 (Fig. 5).

Fig. 2. Expression of XC3H-1 mRNA. Northern analysis was performed on total RNA (15 µg/lane) prepared from (A) adult tissues and (B)
embryos at different stages of development. The positions of the major species of ribosomal RNA are indicated. Blots were probed with the 1.6 kb
XC3H-1 cDNA. The ~4 kb XC3H-1 transcript was widely expressed in the adult. It was detectable in the oocyte and egg, but only became
detectable in the embryo at stages 24 (tailbud) and 43 (tadpole).

XC3H-3 was more similar to human ERF-2 (Nie et al., 1995)
than to mouse TIS11D (Varnum et al., 1991) at both
the amino and carboxyl termini. The loss of alignment
between the carboxyl terminal sequences of TIS11D
(Varnum et al., 1991) and ERF-2 (Nie et al., 1995) was
noted previously (Nie et al., 1995) to begin 12 amino
acids before the TIS11D stop codon. Inspection of the
original TIS11D nucleotide sequence (GenBank accession
number M58564) revealed that a frameshift at this
point in the TIS11D sequence would have resulted in a
continuation of the open reading frame that was much
more similar to ERF-2 and XC3H-3. Similarly, the loss
of amino terminal homology between TIS11D and
XC3H-3 on the one hand and ERF-2 and TIS11D on
the other appears to be due to the fact that the earlier
initiator methionine used by both ERF-2 and XC3H-3
was not present in the cDNA encoding TIS11D (Varnum
et al., 1991).

XC3H-3 was expressed as a major ~5 kb transcript
in all of the adult Xenopus tissues analyzed (Fig. 6A),
and it was readily detectable in the egg and again after
the MBT (Fig. 6B). The decreases in size and amount
of the transcript in embryonic stages 9 and 13 most
likely represent the early degradation of maternal transcripts
before zygotic transcription. In addition, the
relatively minor ~4 kb transcript in the egg and stage
7 embryo may represent a degradation product; however,
the possibility of an additional minor or related
transcript cannot be ruled out.


3.5.  Cloning  and  characterization  of  a  novel  member  of 
the  vertebrate  CCCH family,  XC3H-4 

The three Xenopus members of the CCCH protein
family described thus far each contained two zinc finger
motifs of the YKTELCx8Cx5Cx3H type, in
which the H of the first finger was separated from the
first C of the second finger by 18 amino acids. These
proteins were each highly homologous to one of the
three known mammalian CCCH proteins with two zinc
fingers. XC3H-4, on the other hand, appeared to represent
a previously unknown vertebrate family member,
with two additional zinc finger motifs and a very
restricted expression pattern. XC3H-4 contained a
1195 bp insert (GenBank accession number AF061983)
that predicted a single open reading frame encoding a
protein of 276 amino acids with a calculated molecular
mass of 30 325 Da, pI 5.9 (Fig. 7A). This insert also
contained 38 bp of 5' UTR and 309 bp of 3' UTR. Aside
from the zinc finger motifs, this protein did not significantly
resemble any known protein in current databases,
including the most recent releases of expressed sequence
tag databases.
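
The Cx8Cx5Cx3H spacing and the 18-residue separation between fingers can be checked directly on a sequence with a regular expression. The sketch below is illustrative rather than the method used in this work. It scans a fragment of the XC3H-2 protein (residues 116 onward, transcribed from the Fig. 3 alignment as it appears in this text, so transcription errors are possible), reports each CCCH finger with its five-residue lead-in, and computes the number of residues between the H of one finger and the first C of the next. The function names are hypothetical.

```python
import re

# C-x8-C-x5-C-x3-H, wrapped in a lookahead so overlapping candidates are also found.
CCCH = re.compile(r"(?=(C.{8}C.{5}C.{3}H))")


def find_ccch_fingers(protein: str, offset: int = 1):
    """Return (first_C_position, H_position, lead_in, motif) for each CCCH finger.

    `offset` is the residue number of the first character of `protein`.
    """
    fingers = []
    for match in CCCH.finditer(protein):
        start = match.start(1)
        motif = match.group(1)
        lead_in = protein[max(0, start - 5):start]  # five residues before the first C
        first_c = start + offset
        last_h = start + len(motif) - 1 + offset
        fingers.append((first_c, last_h, lead_in, motif))
    return fingers


def linker_lengths(fingers):
    """Residues between the H of each finger and the first C of the next."""
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(fingers, fingers[1:])]


# XC3H-2 residues 116 onward, as transcribed from the Fig. 3 alignment.
XC3H2_FRAGMENT = (
    "Y"
    "KTELCRPFEENGSCKYGDKCQFAHGIHELR"
    "SLTRHPKYRTELCRTFHTIGFCPYGPRCHF"
    "IHNAEERR"
)

if __name__ == "__main__":
    found = find_ccch_fingers(XC3H2_FRAGMENT, offset=116)
    for first_c, last_h, lead_in, motif in found:
        print(first_c, last_h, lead_in, motif)
    print(linker_lengths(found))
```

The same scan applied to a protein with four such motifs would return three linker lengths, which is a convenient way to compare the spacing of the two finger pairs.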

The 3' UTR contained four widely spaced AUUUA
instability motifs, which can promote rapid turnover of
the mRNA (Shaw and Kamen, 1986). Besides the
expected two highly conserved YKTELCx8Cx5Cx3H
motifs,  XC3H-4  also  contained  two  additional  CCCH 
motifs  with  identical  internal  spacing  but  without  the 

Fig. 3. Predicted amino acid sequence of XC3H-2 aligned with human ERF-1. The human DNA sequence is from Barnard et al. (1993) (GenBank
accession number X71901); the Xenopus DNA sequence is in GenBank (accession number AF061981). All other symbols are as in the legend to Fig. 1.

YKTEL lead-in sequence (Fig. 7A). These two fingers
were separated by only seven amino acids (Fig. 7A). In
the second, more carboxyl-terminal pair of CCCH
repeats, the lead-in sequence was PYRER in the first
repeat and SARET in the second repeat (Fig. 7B). A
glutamic acid residue was conserved in the fourth position
in the lead-in sequence from all four putative zinc
fingers, and the remaining four residues were either
basic or neutral. The consensus shared by the four
CCCH repeats in XC3H-4 was therefore
ExCx6GxCxYx3CxFxH (where x is a variable amino
acid) (Fig. 7B).

Northern analysis revealed high-level expression of
the 1.2 kb XC3H-4 transcript in the ovary, but there
was no detectable expression in any other of the adult
tissues examined (Fig. 8A). No detectable expression
was observed in the kidney by this means, suggesting
the possibility that the original kidney library from
which these cDNAs were selected might contain contaminating
oocyte sequences. XC3H-4 mRNA was very
highly expressed in the oocyte, egg, and early embryo
until stage 10, corresponding to the end of MBT, but it
had disappeared by stage 13 (Fig. 8B). To evaluate the
expression of XC3H-4 between stages 9 and 13 more
carefully, embryos at stages 9-13 were collected, and
RNA from these embryos was analyzed on a northern
blot (Fig. 8C). The rapid disappearance of the mRNA
after stages 9-10 suggests rapid degradation of maternal
mRNA without new zygotic transcription.

3.6. Localization of XC3H-4 mRNA in Xenopus ovary
by in situ hybridization

In situ hybridization of formalin-fixed, paraffin-embedded
Xenopus ovary sections with the bp 1-245
digoxigenin-labeled antisense mRNA revealed strong

Fig. 4. Expression of XC3H-2 mRNA. Northern analysis was performed on total RNA (15 µg/lane) prepared from (A) adult tissues and (B)
embryos at different stages of development. Blots were probed with the 0.5 kb BamHI XC3H-2 cDNA fragment. Multiple transcripts, ~3 kb,
~2.5 kb, and ~1.3 kb in size, were present in the ovary, egg and early embryo. The 4 kb transcript that is widely expressed in the adult was first
detectable in the embryo at stage 13 (late gastrula).


positive staining in the cytosol of oocytes at all developmental
stages examined (Fig. 9). There was no detectable
nuclear staining, and no apparent staining of the
ovarian stromal tissue. A similar staining pattern was
observed with the bp 245-363 probe (not shown).
Although the smaller oocytes appeared to be uniformly
stained in the cytosol, more mature oocytes exhibited
clear animal pole localization (Fig. 9).


4.  Discussion 

By  screening  a  Xenopus  kidney  cDNA  library  with 
degenerate  PCR  probes  based  on  mammalian  sequences, 
we  have  cloned  a  new  vertebrate  member  of  the  double 
CCCH  class  of  zinc  finger  proteins,  in  addition  to  the 
apparent Xenopus homologues of TTP, cMG1, and
TIS11D. The unusual properties of this novel protein,
which  we  have  labeled  XC3H-4,  include  two  additional 
CCCH  zinc  finger  motifs  without  the  usual  YKTEL 
lead-in  sequence.  In  addition,  expression  of  its  mRNA 
appeared  to  be  limited  to  oocytes,  eggs,  early  embryos 
(before  stage  10)  and  adult  ovary;  in  all  these  tissues, 
its  transcripts  appeared  to  be  quite  abundant.  These 
transcripts  were  present  in  oocytes  of  all  stages,  and 
were  localized  to  the  animal  pole  in  mature  oocytes. 

Unlike XC3H-1, XC3H-2, and XC3H-3, which all
have basic pI values, XC3H-4 is an acidic protein with
a pI of 5.9; secondary structure analysis suggests that it
has no notable hydrophobic domains characteristic of
integral membrane proteins. Proline, serine and leucine
each comprise 12-13% of the amino acid composition.
Apart from its four CCCH motifs, XC3H-4 does not
appear to be related to any sequences in currently
available databases; specifically, no mammalian counterparts
of this protein were found in current databases.
Its function remains unknown; however, it may be
possible to explore the function of XC3H-4 during
oogenesis and/or early development by the use of antisense
oligonucleotide and host transfer techniques
(Wylie et al., 1996).

Recently, our group has established a possible function
for the subclass of CCCH proteins to which all
four of the cloned Xenopus proteins belong. TTP, the
prototype of proteins of this class, was shown to be
necessary for the normal lability of TNFα mRNA in
mouse macrophages (Carballo et al., 1998). When TTP
is absent, as in TTP-knockout mice, the resulting abnormal
stabilization of the TNFα mRNA results in hypersecretion
of TNFα from macrophages, and a chronic
inflammatory syndrome in intact mice (Taylor et al.,
1996a,b; Carballo et al., 1997, 1998). TTP has been
shown to bind directly to the AU-rich element in the
TNFα mRNA 3' UTR (Carballo et al., 1998); this
binding in some way results in destabilization of the
mRNA (Carballo et al., 1998), probably by initiating
its deadenylation (Lai et al., submitted). These properties
of mammalian TTP, i.e. direct binding to the AREs
of TNFα and other cytokines and the ability to destabilize
TNFα mRNA in co-transfection studies, were shared
by another mammalian family member, cMG1, and
were also shared by XC3H-1 and XC3H-3 (Lai et al.,
submitted). There was no obvious response to XC3H-2
in these studies, perhaps because of its relatively poor
expression in the 293 cell transfection experiments (Lai
et al., submitted). XC3H-4 was also ineffective in these
assays, perhaps for the same reason, or perhaps because
it differs from the three known mammalian proteins in

Fig. 5. Predicted amino acid sequence of XC3H-3 aligned with human ERF-2. The human DNA sequence is from Nie et al. (1995) (GenBank
accession number X78992). The Xenopus DNA sequence is in GenBank (accession number AF061982). All other symbols are as in the legend
to Fig. 1.

Fig. 6. Expression of XC3H-3 mRNA. Northern analysis was performed on total RNA (15 µg/lane) prepared from (A) adult tissues, oocytes, and
eggs and (B) embryos at different stages of development. Blots were probed with a (A) 0.5 kb EcoRI/NcoI XC3H-3 cDNA and (B) 0.8 kb
SstI-HindIII XC3H-3 cDNA. The ~4.4 kb transcript was expressed in all adult tissues tested, as well as oocytes and eggs; it was also expressed in the
embryo by stage 19.


several  ways,  as  noted  above.  However,  it  seems  likely 
that  at  least  three,  and  possibly  all  four,  CCCH  proteins 
from  Xenopus  will  turn  out  to  be  ARE-binding  and 
mRNA  destabilizing  proteins.  Since  control  of  poly  A 
tail  length  is  critical  in  early  development  in  Xenopus 
and  other  species  (Beelman  and  Parker,  1995;  Audic 
et  al.,  1997;  Paillard  et  al.,  1998),  it  seems  likely  that 
one  of  the  functions  of  this  class  of  proteins  in  Xenopus 
is  to  regulate  mRNA  turnover  at  critical  times  during 
oocyte  maturation  or  early  embryo  development.  The 
specific  mRNA  targets  of  each  of  these  proteins  will  be 
the  subject  of  further  study. 

In situ hybridization of Xenopus ovary using digoxigenin-labeled
antisense RNA probes for XC3H-4
revealed that the smallest oocytes were strikingly and
uniformly labeled in the cytosol but not the nucleus.
More mature oocytes showed strong positive cytoplasmic
staining in the animal pole, including the cortex
and  the  perinuclear  region.  Only  a  small  fraction  of 
Xenopus  maternal  RNAs  are  thought  to  be  localized  in 
this  way  (Rebagliati  et  al.,  1985).  The  mechanisms  of 
this  localization  of  XC3H-4  to  the  animal  cortex,  and 
its  developmental  significance,  will  require  further  study. 

To  date,  the  three  known  vertebrate  double 
Cx8Cx5Cx3H proteins have contained two very similar
CCCH  zinc  fingers,  with  the  carboxyl-terminal  H  of  the 
first  finger  separated  from  the  amino  terminal  C  of  the 
second  finger  by  18  amino  acids  in  all  three  cases.  This 


Fig. 7. Predicted amino acid sequence of XC3H-4. The DNA sequence has been deposited in GenBank (accession number AF061983). (A) The
predicted amino acid sequence encoded by XC3H-4 cDNA is shown here. The four putative zinc finger motifs are shaded in gray; the CCCH
residues of the four zinc finger repeats are delineated by bold letters. Amino acid positions are numbered on the left. (B) Alignment of the four
zinc finger motifs in XC3H-4, from the most amino terminal (1) to the most carboxy terminal (4). All other symbols are as in the legend to Fig. 1.

Fig. 8. Expression of XC3H-4 mRNA. Northern analysis was performed on total RNA prepared from (A) adult tissues (15 µg/lane), (B) embryos
at different stages in development (15 µg/lane), and (C) embryos from stages 9-13 (7 µg/lane). Blots were probed with the 0.8 kb XC3H-4
cDNA. Expression of the ~1.2 kb transcript was limited to the ovary, egg and early embryo until stage 10 (late blastula); there was no detectable
expression after the mid-blastula transition.


spacing  was  exactly  the  same  in  the  Xenopus  homo- 
logues;  in  fact,  14  of  the  18  intervening  amino  acids 
from  XC3H-1  were  identical  to  those  of  human  TTP 
(Taylor  et  al.,  1991),  whereas  all  18  of  the  residues  were 
identical  when  comparing  XC3H-2  with  ERF-1 
(Barnard  et  al.,  1993)  and  XC3H-3  to  ERF-2  (Nie 
et  al.,  1995).  In  contrast,  although  the  amino-terminal 
two  fingers  of  XC3H-4  were  separated  by  the  usual  18 
amino  acids,  the  carboxyl-terminal  pair  of  putative  zinc 
fingers  was  separated  by  only  seven  amino  acids.  This 
difference between the two pairs of zinc fingers, the
different lead-in sequences in the carboxyl-terminal pair,
and the overall pI suggest that XC3H-4 is the prototype
of a new subclass of CCCH zinc finger proteins.
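
The spacing comparison described above can be illustrated with a short sketch. It scans a protein sequence for the Cx8Cx5Cx3H pattern and reports the number of residues between the last H of one finger and the first C of the next. The function names and the toy sequence are hypothetical and were invented for illustration; the toy sequence is not XC3H-4 or any real protein.

```python
import re

# Classical TTP-type finger as written in the text: C-x8-C-x5-C-x3-H.
# The lookahead lets overlapping candidates be reported as well.
CCCH = re.compile(r"(?=(C.{8}C.{5}C.{3}H))")


def find_ccch_fingers(protein):
    """Return (start, end) pairs, 1-based and inclusive, for each match."""
    return [(m.start(1) + 1, m.end(1)) for m in CCCH.finditer(protein)]


def linker_lengths(fingers):
    """Residues between the final H of one finger and the first C of the next."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(fingers, fingers[1:])]


# Hypothetical toy sequence: two fingers joined by an 18-residue linker.
finger = "C" + "A" * 8 + "C" + "A" * 5 + "C" + "A" * 3 + "H"
toy = "MS" + finger + "G" * 18 + finger + "KK"

fingers = find_ccch_fingers(toy)
print("fingers:", fingers)
print("linker lengths:", linker_lengths(fingers))
```

A finger pair with a shorter linker, such as the seven-residue spacing reported for the carboxyl-terminal pair of XC3H-4, would simply yield a smaller value from `linker_lengths`.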

Recently, a CCCH bovine protein containing four
zinc fingers in the Cx[..]Cx[..]Cx[..]H configuration and one
zinc finger in the Cx8Cx4Cx3H form, in addition to a
CX2CX4HX4C zinc knuckle motif, was identified as the
30 kDa subunit of cleavage and polyadenylation specificity
factor (CPSF) (Barabino et al., 1997). None of
the five putative zinc fingers had classical TTP
Cx8Cx5Cx3H spacing, and none had the typical YKTEL
lead-in sequence (Barabino et al., 1997). CPSF 30K was
shown to be essential for 3' end processing of pre-mRNAs.
Although deletion of the zinc knuckle motif
resulted in dramatic decreases in the protein's ability to
interact with RNA, low affinity for poly(U) was still
maintained. These findings support a possible role for
the CCCH zinc fingers in binding RNA. The mouse
homologue of the 30 kDa subunit of CPSF (GenBank
accession number U96448) contained similar CCCH
domains but lacked the zinc knuckle.

Fig. 9. In situ hybridization histochemistry of XC3H-4 mRNA expression in Xenopus ovary. In situ hybridization analysis was performed using
antisense (A, C, D, and E) and sense (B) digoxigenin-labeled XC3H-4 mRNA probes on formalin-fixed, paraffin-embedded Xenopus ovary sections.
Blue color represents the hybridization signal. A and B are neighboring sections of the same oocyte. A, animal pole; V, vegetal pole; N, nucleus;
O, ovary stromal tissue. The arrowheads in C-E point to developing oocytes. All photomicrographs were taken at the same magnification; the
bar = 100 µm.

Based  on  its  49%  amino  acid  identity  and  expression 
pattern,  XC3H-1  appears  to  be  the  Xenopus  homologue 
of  mammalian  TTP.  Like  mammalian  TTP,  XC3H-1  is 
a basic protein with a pI of 9.3. One of the three PPPPG
repeats  that  are  present  in  all  mammalian  TTPs  is 
perfectly  conserved  in  XC3H-1,  while  the  remaining  two 
repeats  are  partially  conserved.  In  addition  to  these 
conserved  repeats,  the  serine  residue  that  is  a  major  site 
of  MAP  kinase  phosphorylation,  serine  220  in  the  mouse 
TTP  sequence  (Taylor  et  al.,  1995),  is  also  conserved  in 
XC3H-1,  as  serine  222,  as  is  the  following  proline.  The 
apparent  lack  of  XC3H-1  expression  in  early  Xenopus 
development  makes  it  feasible  to  perform  overexpression 
studies  in  early  embryos  as  an  approach  to  function. 
Although the mammalian protein has been implicated
in the regulation of TNFα release from macrophages
(Taylor et al., 1996a,b; Carballo et al., 1997, 1998), and
both TTP and XC3H-1 bind to mammalian TNFα and
GM-CSF AREs and destabilize TNFα mRNA (Lai
et al., submitted), the physiologically relevant binding
partner for XC3H-1 in Xenopus remains unknown.

XC3H-2, the Xenopus homologue of ERF-1, is unusual
in that transcripts of multiple sizes are present
during early development. This finding has some precedent
in the literature. For example, multiple DTIS11
transcripts were expressed during Drosophila embryogenesis,
with a predominant 3 kb species expressed
during early embryonic development, and a 6 kb species
that was detected in later embryonic stages (Ma et al.,
1994). Similarly, alternative splicing in the zinc finger
region of the wt1 gene resulted in the major transcript
having three extra amino acids between the third and
fourth zinc fingers. The major and minor forms then
had distinct DNA binding specificities (Bickmore et al.,
1992). Thus, certain zinc finger proteins are subject to
post-transcriptional regulation that results in the synthesis
of multiple potential regulatory proteins from a single
transcriptional unit. Future studies will attempt to determine
whether the observed multiple transcripts of
XC3H-2 are derived from developmentally controlled
differential splicing, alternative transcription start sites,
or a shortening of maternal mRNA.

The study of zinc finger proteins in Xenopus has been
focused thus far largely on the C2H2 proteins, which
represent a large multigene family whose members are
maternally transcribed and widely expressed in eggs and
embryos (Koster et al., 1988; Knochel et al., 1989).
Oligonucleotide-directed destruction of the 'entire pool'
of maternal mRNAs encoding C2H2 zinc finger proteins
resulted in no notable change in normal oocyte maturation
and embryogenesis (el-Baradi et al., 1991).
However, the authors admit to the possibility of incomplete
mRNA destruction in their experiments. Our
studies have identified four zinc finger proteins of the
CCCH class in Xenopus that are all represented in the
maternal pool of mRNA. Oligonucleotide-mediated
selective destruction of transcripts encoding each of
these proteins may help to elucidate their potential role
in oocyte maturation and/or embryogenesis in Xenopus,
and perhaps suggest similar roles in early development
for their mammalian counterparts.

[...] recognize the consensus DNA sequence [...] distinct internal and flanking
bases. In [...] myogenesis by MyoD-related bHLH proteins depends on myogenic
basic region (BR) and [...] residues that are not essential for binding to a
muscle-specific site, implying that [...] interactions. We have investigated
whether the [...] sequence recognition and how MyoD, Twist, and their E2A
partner proteins prefer distinct CAN NTG sites (the E-box consensus sequence is
underlined). In MyoD, the myogenic BR residues [...] particular CAN NTG sites
indirectly, by influencing the conformation through which the B[...] binding by
BR and junction mutants suggests that an appropriate [...] is necessary but not
sufficient for myogenesis, supporting the model that additional [...] important.
The sequence specificities of E2A and Twist proteins require the [...] residues.
In addition, mechanisms that position the BR allow E2A to prefer distinct
half-sites as a heterodimer [...] or Twist, indicating that the E2A BR can be
directed toward different targets [...] findings indicate that E2A and [...]
adopting particular preferred BR-DNA conformations, [...] that can be important
for functional specificity.


A large family of transcriptional regulators is defined by the
basic helix-loop-helix (bHLH) motif (40), in which a DNA-binding
basic region (BR) lies immediately amino terminal to
the HLH dimerization segment (17, 41, 55). In metazoans,
bHLH proteins are involved in specification of multiple cell
types (33, 43, 56). Some bHLH family members function as
homodimers, but others appear to act together with a heterodimeric
partner (56). For example, the closely related
bHLH proteins that mediate myogenic differentiation, including
MyoD, are thought to function as heterodimers with E
proteins, a widely expressed bHLH protein subgroup that is
exemplified by the E2A proteins (14, 17, [...]). Most bHLH
protein dimers bind to the consensus CAN NTG (the E box;
for ease of identification, the consensus sequence is underlined
throughout the text), with each respective BR binding to a half
site (9, 19-21, 35, 45, 48). Given the many regulatory processes
in which bHLH proteins are involved, the apparent simplicity
of the CAN NTG consensus raises the important question of
how different bHLH proteins act only on appropriate target
genes.

In part, the specificity with which bHLH proteins function
derives from preferential recognition of different classes of
CAN NTG sites by different bHLH protein subgroups. The
HLH segment consists of a parallel four-helix
bundle (Fig. 1) (19-21, 35, 45, 48). The BR is unstructured in
solution (2), but when bound to DNA, it extends N terminally
from the HLH segment as an α helix that crosses the major
groove (Fig. 1). Crystallographic analyses have revealed some
differences in how these proteins bind DNA. For example, in
Myc family and related bHLH proteins, an arginine (Arg)
residue at BR position 13 (Fig. 2) specifies recognition of
CACGTG sites (7, 16, 25, 54) by contacting bases in the center
of the site (20, 21, 48). However, it still is not understood how bHLH
proteins which have a different amino acid at BR position 13
(Fig. 2) bind preferentially to distinct CAN NTG sites (9, 16)
or how bHLH proteins establish differences in flanking sequence
selectivity (9, 23, 24) that can be of biological importance.

Many bHLH proteins that lack R13, including [...] and
other E2A partners (Fig. 2), can bind to similar DNA sequences
in vitro but act on different tissue-specific genes (56).
Cooperative or inhibitory relationships with other transcriptional
regulators might contribute to this specificity (34, 39, 4[...],
58), but it is not likely to derive entirely from other lineage-specific
factors, because MyoD can induce myogenesis in many
different cell types (56). Initiation of myogenesis by MyoD and
other myogenic bHLH proteins depends on three residues that
are located within the BR and the BR-HLH junction (A5, T6,
and K15 [Fig. 1 and 2]). These myogenic residues are not
essential for binding a muscle-specific site in vitro or in vivo,
suggesting that they are involved in other critical interactions
(11, 17, 18, 47, 57). These interactions have been proposed to
involve distinct cofactors (11, 17, 57) and the unmasking of an
activation domain in MyoD or the myogenic cofactor MEF2
(3, 5, 29, 57). In the MyoD-DNA structure, K15 is oriented
away from the DNA, but A5 and T6 lie in the major groove and
could not contact other proteins directly (35) (Fig. 1). However,
the latter two residues could influence protein-protein
interactions indirectly, by affecting how the BR is positioned
on the DNA (35). Although substitutions at these positions
might not substantially impair binding to
CAN NTG sites, it is important to determine whether they
might have more subtle influences on sequence specificity that
could reflect conformational effects.

We have determined that the myogenic residues A5 and T6
establish the characteristic MyoD binding sequence preference,
which includes a CAGCTG core. Individual substitutions at these BR
positions simultaneously alter preferences for multiple bases
that MyoD does not contact directly (35), indicating that these
preferences are determined indirectly, by how the BR helix is
positioned on the DNA. This mechanism is distinct from the
standard model for sequence specificity, in which preferred
bases are contacted directly (44, 50). The corresponding BR
residues are also required for the sequence preferences of E2A
proteins, which can recognize either of two distinct half-sites
depending on their dimerization partner. E2A homodimers
and E2A-MyoD heterodimers bind to asymmetric sites that
include a CACCTG core. In contrast, as a heterodimer with
the bHLH protein Twist, E2A binds preferentially to half of
the symmetric sequence CATATG. The preference of E2A for
the former asymmetric sites depends not only on the BR sequence
but also on BR positioning that involves the junction
region. An analysis of DNA binding by MyoD and E2A junction
and BR mutants indicates that a MyoD-like sequence
specificity is associated with, but not sufficient for, myogenesis.
This supports the model that the BR junction region is also
involved in other critical interactions. The results suggest that
E2A and its partner bHLH proteins bind DNA by adopting a
limited number of preferred BR conformations, each of which
is associated with a characteristic DNA sequence preference.
They also indicate that binding of cofactors to the MyoD BR
can be influenced by how it is positioned on the DNA and are
consistent with the idea that relatively subtle differences in
binding sequence recognition can modulate bHLH protein activity
in vivo.

FIG. 1. A MyoD-DNA complex. In this X-ray crystallographic structure (35),
a MyoD homodimer is bound to the sequence AACAGCTGTT, which corresponds
to its preferred recognition consensus (9). Residues are numbered as in
full-length MyoD, and their positions as specified in Fig. 2 and the text are
indicated in parentheses. Binding site positions ±5 (numbered as in Fig. 3A) are
indicated by grey numerals. Side chains are shown only for the myogenic residues
(green) (18) and Arg 111 (R2) (gold).

MATERIALS AND METHODS

Mutagenesis, protein expression, and DNA binding assays. The various MyoD
and E2A mutants used in this study have been described previously (17, 18, 57),
with the exception of the MyoD mutants shown in Fig. 8A. For construction of
those mutants, a SalI site that did not alter the encoded amino acid sequence was
created at MyoD BR positions 10 and 11 (Fig. 2) by PCR. BR mutants were then
generated by PCR using Pfu or Vent polymerase and introduced into this MyoD
(SalI) construct as PmlI-SalI fragments. Junction region mutations were created
similarly by PCR and inserted into MyoD (SalI) as SalI-NarI fragments. Constructs
with both BR and junction mutations were produced by introduction of a
mutant PmlI-SalI or SalI-NarI fragment into the appropriate BR or junction
mutant construct. All of these mutations were confirmed by DNA sequencing.

For the in vitro selection experiment shown in Fig. 3, full-length MyoD was
expressed in bacteria from a pRK171a-based construct (pT7-MyoD) described
previously (53). The MD(E12B), MD(E12B-A), and MD(E12B-AT) mutations
(57) were each introduced into this construct within a PmlI-MluI fragment.
These proteins were expressed by isopropyl-β-D-thiogalactopyranoside induction
in Escherichia coli BL21(DE3)/pLysS cells as described previously (51) and then
purified to >90% homogeneity by precipitation in 0.6 M (NH4)2SO4. Precipitated
protein was resuspended in a mixture containing 10% glycerol, 20 mM
HEPES (pH 7.6), 100 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol, 1 mM
phenylmethylsulfonyl fluoride, 1 µg of leupeptin per ml, and 1 µg of pepstatin
per ml.

Other proteins were expressed by in vitro translation (Promega), with in vitro
transcription and translation performed in separate steps. Protein expression was
carefully quantitated by 35S-labeled translation and sodium dodecyl sulfate-polyacrylamide
gel electrophoresis. These procedures and those for electrophoretic
mobility shift assay (EMSA) have been described previously (31). Each
EMSA was performed at room temperature and analyzed by autoradiography or
phosphorimaging. Individual oligonucleotide sites were 21 bp in length and
differed from the MyoD consensus oligonucleotide (7) only at the positions
indicated. The MCK-R site corresponds to the right E-box site in the muscle
creatine kinase enhancer (18).

In vitro selection experiments. Populations of preferred binding sites were
isolated by sequential in vitro selection and PCR amplification essentially as
described previously (6, 7, 9). During each selection round, DNA that was bound
by the protein complex of interest was isolated by EMSA and then amplified by
PCR for the next round. In each EMSA selection, care was taken to ensure that
sufficient quantities of labeled bound DNA were recovered to maintain a representative
population of sequences. These experiments were initiated with 0.5
ng of 32P-end-labeled starting library. In each subsequent selection round, selections
were performed with approximately 0.1 ng of amplified 32P-body-labeled
DNA. In selections for binding to partially purified bacterially expressed MyoD
mutant proteins (Fig. 3), these protein preparations were not quantitated, but
instead sequential dilutions of these samples were tested for binding. Bound
sequences were then recovered and amplified from a sample in which less than
10% of the input DNA was in the bound fraction. This strategy ensured selection
of optimal binding sequences. The final selected binding site pool was sequenced
directly, using a 32P-end-labeled primer as described previously (7). Mouse
proteins were used in these selections with the exception of Twist, which was
from Xenopus. Binding site competition analyses (not shown) demonstrated that
its binding preferences were indistinguishable from those of mouse Twist, which
was used in the EMSA analyses shown.

RESULTS 

Myogenic BR residues and MyoD DNA binding preferences.
Identification of the myogenic BR residues stemmed originally
from studies in which the MyoD BR was replaced with that of
E12, a product of the alternatively spliced E2A gene (40). This
MyoD mutant [MD(E12B) (Fig. 2)] binds to a muscle-specific
regulatory site as a heterodimer with E2A proteins either in
vitro or in vivo, but it cannot induce myogenesis in a cell
culture assay or activate transcription through a muscle-specific
enhancer (17, 57). Resubstitution of the myogenic residues
A5 and T6 (Fig. 2) in MD(E12B) restores its activity in
these functional assays (57). Similar results are obtained when
A5 and T6 are mutated within MyoD (18, 29, 57) and when
analogous substitutions are made in the context of the myogenic
bHLH protein myogenin (11). These experiments implicate
A5 and T6 in mechanisms that are of functional importance
but not essential for binding to a particular muscle-specific
DNA sequence.

We used an in vitro selection strategy (9) to test whether
such mutations might have more subtle effects on how MyoD
binds specifically to DNA. To identify sequences to which


VOL.  20,  2000 



MYOGENIC  bHLH  RESIDUES  AND  DNA  SEQUENCE  RECOGNITION  3 


Protein          Myogenic activity (muscle)

Basic regions:
MyoD             ++++
E12              No
Twist            ND

Mutants:
MD(E12B)         No
MD(E12B-A)       No
MD(E12B-AT)      +
MD(E12BJ)        No
E12(MDB)         No
E12(MDBJ)        ++
E12(AT.MDJ)      +
E12(AT.K)        +
E12(AT)          No

[Basic region and junction sequence alignment: only fragments are legible (MyoD, KRKTTNAD[...]; E12, QKAEREKE[...]; Twist, QSYEELQT[...]).]

FIG. 2. Myogenic activity of MyoD and E12 BR and junction mutants. Each of these mutants has been described previously (18, 57), and their sequences are
compared with sequences from mouse MyoD, E12, and Twist. Amino acids that are identical to those of MyoD are underlined, positions that are conserved in most
bHLH proteins are shaded, and entire BR and junction regions that have been swapped are bracketed. The column at the right indicates the relative activities of these
proteins when assayed previously by transfection for conversion of cultured cells into muscle (18, 57); activity is denoted as ++++ (frequency of myogenic conversion
obtained with wild-type MyoD), ++ (30 to 50% of that obtained with MyoD), + (5 to 30% of that obtained with wild-type MyoD), No (myogenic conversion was not
detected), or ND (not done).


these mutants bind preferentially, we used sequence libraries
in which only positions within and flanking the CAN NTG
consensus are randomized (Fig. 3A), so that the position of
bHLH protein binding along the DNA is fixed. This strategy
makes it possible to sequence the selected sites as a pool and
thereby to analyze a very large population of selected sites
simultaneously (8, 9). It reveals the relative preferences for
individual bases at each site position and can detect subtle
differences that might not be identified through more conventional
approaches.
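
The idea of per-position base preferences can be sketched in code. In the experiments described here the selected pool was sequenced directly, so the preferences were read from the pooled sequencing signal; the sketch below instead tallies individual aligned sequences, which is only a stand-in for that readout. The function names, the 0.25 threshold, and the four example sites are hypothetical and were invented for illustration.

```python
from collections import Counter


def base_frequencies(sites):
    """Per-position base frequencies for equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    table = []
    for column in zip(*sites):
        counts = Counter(column)
        table.append({base: counts.get(base, 0) / len(sites) for base in "ACGT"})
    return table


def consensus(table, threshold=0.25):
    """List, per position, the bases whose frequency exceeds the threshold."""
    return ["/".join(b for b in "ACGT" if column[b] > threshold) for column in table]


# Hypothetical selected sites, invented for illustration only.
sites = ["GACAGCTGTC", "AACAGCTGTT", "GACAGCTGTT", "AACAGCTGTC"]

table = base_frequencies(sites)
print(consensus(table))
```

Because the library fixes where the protein binds, the sites are already in register, which is why no alignment step appears in the sketch.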

This assay has previously shown that the preferred MyoD
binding consensus is (G/A)ACAGCTG(T/C) (Fig. 3B and C)
and that the E2A proteins E12 and E47 overlap considerably
with MyoD in their binding properties but prefer sites that
have an asymmetric CACCTG core sequence (Fig. 3C) (9).
However, in contrast to either of these proteins, the
MD(E12B) mutant prefers the sequence (G/A)CCATATGG(T/C),
which differs from the MyoD preferred site over the
eight central base pairs and contains the distinct core sequence
CAT ATG (Fig. 3B and C). This sequence and related elements
are normally targeted by the bHLH protein Twist, an
E-protein partner that is involved in mesodermal cell fate
specification (15, 27, 37, 52, 60) (Fig. 2). Back-substitution of
A5 of MyoD into MD(E12B), which is not sufficient for myogenic
activity in cell culture assays (57), results in preferences
that are slightly more similar to those of MyoD at positions ±4
[MD(E12B-A) (Fig. 2, 3B, and 3C)]. However, introduction of
both A5 and T6, which restores myogenesis (11, 57), results in
preferences across the entire site that are indistinguishable
from those of MyoD [MD(E12B-AT) (Fig. 2, 3B, and 3C)].
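
As a rough illustration of how the core sequences named in the text differ, the following sketch scans a DNA string for CANNTG sites, reports whether each core is symmetric (identical to its reverse complement), and attaches a shorthand label. The labels, function names, and test sequences are assumptions made for this sketch; real binding preferences also depend on flanking bases, which the sketch ignores.

```python
import re

# E-box consensus from the text: CANNTG. The lookahead keeps overlapping sites.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

# Core preferences named in the text; the labels are shorthand for this sketch.
CORE_LABELS = {
    "CAGCTG": "MyoD-preferred core",
    "CACCTG": "E2A homodimer / E2A-MyoD core",
    "CATATG": "Twist / Twist-E2A core",
    "CACGTG": "Myc-family core",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan_eboxes(dna):
    """Return one record per CANNTG site, with 1-based position and core class."""
    dna = dna.upper()
    hits = []
    for m in EBOX.finditer(dna):
        core = m.group(1)
        other_strand = reverse_complement(core)
        # An asymmetric core such as CACCTG reads CAGGTG on the other strand.
        label = CORE_LABELS.get(core) or CORE_LABELS.get(other_strand, "other CANNTG")
        hits.append({
            "pos": m.start(1) + 1,
            "core": core,
            "symmetric": core == other_strand,
            "label": label,
        })
    return hits


for hit in scan_eboxes("AACAGCTGTT") + scan_eboxes("ttcaggtgaa"):
    print(hit)
```

The symmetry flag mirrors the distinction drawn in the text between symmetric cores such as CAGCTG or CATATG and the asymmetric CACCTG core.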

To determine whether these sequence preferences reflect
significant differences in binding affinity and specificity, we
compared levels of binding of these proteins to individual
oligonucleotides that correspond to the MyoD and Twist preferences
and differ only at positions within and adjacent to the
CAN NTG consensus (Fig. 3D). Supporting the in vitro selection
findings, both MyoD and MD(E12B-AT) homodimers
bound with higher affinity to the preferred MyoD site than to
the Twist site (Fig. 3D, lanes 1, 4, 5, and 8). In contrast, the
Twist site was preferred by MD(E12B) and, to a lesser extent,
MD(E12B-A) (Fig. 3D, lanes 2, 3, 6, and 7). In a binding
competition assay, specific DNA binding by MD(E12B-AT)
was competed much more effectively by the MyoD site (Fig.
4A, lanes 4, 7, 10, 13, and 16), and binding by either
MD(E12B) or MD(E12B-A) was competed better by the Twist
site (Fig. 4B, lanes 2, 3, 8, 9, 14, and 15). A c-Myc preferred site
(CACGTG [not shown]) was a relatively poor competitor of
binding by each of these proteins (Fig. 4A and B, lanes 17 to
19). The data show that introduction of A5 and T6 into
MD(E12B) restores not only myogenic activity (Fig. 2) but also
the MyoD DNA binding preference. This substitution affects
sequence recognition across 4 bp within each half-site (Fig. 3A
and B), indicating a global effect on how the BR helix is
positioned on the DNA. The finding that MD(E12B) is distinct



4  KOPHENGNAVONG  ET  AL. 


[Fig. 3 image residue: panel labels MyoD, MD(E12B), MD(E12B-AT), and MD(E12B-A); lanes 1 to 8; sequencing lanes G, A, T, C.]

FIG. 3. In vitro selection assay of binding site preferences. (A) Core sequences of the random sequence oligonucleotide libraries
[...] labeled to the same specific activity at 550 pM. Specific and background species are indicated by open and closed triangles, respectively.


from either MyoD or E12 in its binding sequence preference
also indicates that DNA recognition by an E2A BR can be
profoundly influenced by its molecular context.

Influence of BR positioning on MyoD-E2A and Twist-E2A
heterodimer sequence preferences. Twist and E2A proteins
appear to cooperate in vivo to regulate transcription through
CAT ATG sites (27), suggesting that the DNA sequence recognition
properties of E2A might be altered by heterodimerization
with Twist. However, an alternative possibility is that
functional Twist-E2A recognition sites are distinct from their
in vitro binding preference (28). To address this question, we
performed in vitro selection on Twist-E12 complexes. Twist
homodimers and Twist-E12 heterodimers both preferred sites
that contain the core sequence CAT ATG (Fig. 5A and B).
They were similar to MD(E12B) and especially to
MD(E12B-A) in their preferences at ±4 but selected MyoD-like
sequences at ±5 (Fig. 3B and 3C, 5A, and 5B). The symmetry
of this preferred sequence suggests that in the Twist-E12
protein-DNA complex, the Twist and E12 BRs each prefer the
same half-site sequence. In contrast, and as observed previously
(9), MyoD-E12 heterodimers selected a MyoD-like half
site at positions +4 and +5, an E2A-like half-site at -4 and
-5, and CC or GG bases in the center of the site (Fig. 5A and
B), indicating asymmetric binding. Apparently, an E2A BR
normally prefers distinct half-sites in the context of these two
bHLH dimerization partners, indicating an intermolecular effect
on how it interacts specifically with DNA.

To investigate how heterodimer formation influences the
binding preferences of the E12 and MyoD BRs, we performed
in vitro selection on combinations of MyoD and E12 BR mutants.
When the BR of one partner within a MyoD-E12 heterodimer
was substituted with that of the other, the heterodimer
binding preferences outside the CAN NTG
consensus corresponded to those of the individual BRs. For
example, unlike MD(E12B) homodimers (Fig. 3B and C), heterodimers
of MD(E12B)+E12 preferred wild-type heterodimer
sequences in the center of the site, and selected
E2A-like sequences in both flanking regions, at ±4 and ±5
(Fig. 5A and B). A heterodimer of MyoD and an E12 protein
containing the MyoD BR [E12(MDB) (Fig. 2A)] similarly selected
a wild-type heterodimer preference within the CAN
NTG motif but preferred a MyoD-like sequence at ±4 and ±5
(Fig. 5A and B). In contrast, MD(E12B)-E12(MDB) heterodimers
had a binding preference more similar to that of
Twist (Fig. 5A and B), indicating that placement of each BR in
the protein context of the other partner affected binding over
the entire site. A striking aspect of our findings is that each of
the mutant homo- or heterodimer protein complexes that we
have examined selected sequences that correspond to particular
patterns preferred by MyoD, E2A, or Twist protein (Fig. 3C
and 5B).

These in vitro selection findings were supported by assays of
binding to individual sites, including a sequence from a muscle-specific
regulatory region (MCK-R). This site, which corresponds
to the MyoD-E12 heterodimer in vitro binding preference
and responds to MyoD in vivo, was used in the original
analysis of the myogenic residues (9, 17, 57). In an EMSA,
MyoD-E12 heterodimers bound with higher affinity to either
the MCK-R or MyoD site than to the Twist site (Fig. 5C, lanes
3, 12, and 21). MD(E12B)-E12 heterodimers only slightly
preferred the MCK-R heterodimer site to the Twist site but
appeared to prefer either of these sequences to the MyoD site
(Fig. 5C, lanes 5, 14, and 23). As the preferences of
MD(E12B-A) and MD(E12B-AT) homodimers would predict,
introduction of both A5 and T6 into MD(E12B) altered its
sequence preferences as a heterodimer with E12, so that they
were more similar to those of MyoD (not shown). MyoD-E12(MDB)
heterodimers only modestly preferred the MyoD
or MCK-R site in comparison to the Twist site (Fig. 5C, lanes
4, 13, and 22). In contrast, the Twist site was preferred by
MD(E12B)-E12(MDB), Twist, and Twist-E12 complexes (Fig.
5C, lanes 6, 8, 9, 15, 17, 18, 24, 26, and 27).

Binding site competition and protein titration assays also
supported the in vitro selection data. The MyoD site competed
more effectively than the Twist site for binding by either MyoD
homodimers or MyoD-E12 heterodimers (Fig. 6A and B, lanes
1, 4, 7, 10, 13, and 16). In contrast, the Twist site competed
more effectively for binding by MD(E12B), MD(E12B)-E12,
Twist, and Twist-E12 complexes, although these latter complexes
appeared to bind with less specificity than did MyoD-E12
complexes (Fig. 6C and D, lanes 2, 3, 5, 6, 8, 9, 11, 12, 14,
15, 17, and 18). However, the distinct binding specificities of
MyoD-E12 and Twist-E12 heterodimers were apparent in a
protein titration assay in which the amount of MyoD or Twist
protein was varied under conditions of low DNA concentration
(Fig. 7A and B, lanes 1 to 6 and 13 to 18) that more closely
represent differences in binding affinity (13). Also in agreement
with results described above (Fig. 5C, lanes 14 and 23),
heterodimers of MD(E12B) plus E12 bind to the MCK-R site
with decreased specificity and with slightly lower affinity than
MyoD-E12 complexes (Fig. 7A and B, lanes 7 to 12).

To investigate the role of the BR-HLH junction region in
BR positioning, we examined the DNA binding preferences of
the MD(E12BJ) and E12(MDBJ) mutants, each of which contains
both the BR and junction of the other partner (Fig. 2). In
contrast to MD(E12B)-E12(MDB) heterodimers (Fig. 5A and
B; Fig. 5C, lanes 6, 15, and 24), MD(E12BJ)-E12(MDBJ)
heterodimers (Fig. 2A) bound to the MyoD, Twist, and
MCK-R sites with relative preferences that are comparable to
those of MD-E12 heterodimers (Fig. 5C, lanes 3, 7, 12, 16, 21,


8  KOPHENGNAVONG  ET  AL. 


Mol.  Cell.  Biol. 


[Figure residue: gel panels A and B, each with lanes 1 to 18 grouped as MD+E12, MD(E12B)+E12, and Twi+E12; site labels: Twi site, MCK-R site.]
concentrations (picomolar) shown above the gel. (B) Binding to the MCK-R site, analyzed as for panel A.


and 25). Apparently, the Twist-like sequence preference resulting
from simultaneous mispairing of both the MyoD and
E12 BRs (Fig. 5A and B) can be corrected by matching each of
these BRs with the corresponding junction region. Similarly,
and in contrast to MD(E12B) homodimers, MD(E12BJ) homodimers
bind to the MyoD, Twist, and MCK-R sites with
preferences that are similar to those of E2A proteins (Fig. 8B
and C, lane 20, and data not shown). These findings indicate
that the BR-HLH junction can be critical for establishing the
sequence specificity of an E2A BR, presumably because it
influences how the BR is positioned on the DNA.

Contributions of the BR and junction to binding affinity and
specificity. It has been shown previously that introduction of
A5, T6, and either the junction region or K15 of MyoD confers
upon E12 the capacity to induce myogenesis (Fig. 2) (18). In
the MyoD-DNA complex, A5 and T6 are not positioned to
allow direct protein-protein contact (Fig. 1) (35), but we have
shown that they are critical for the DNA sequence preferences
of MyoD, apparently because they affect the conformation of
the BR-DNA complex. We have also determined that the
junction region can influence how the E2A BR binds DNA.
These observations suggest the possibility that the capacity for
myogenesis derives entirely from the conformation of the
DNA-bound MyoD BR, a model which would predict that the
sequence preference of each of these bHLH proteins is established
by amino acids at BR positions 5, 6, and 15. We have
investigated this model by determining how individual substitutions
at these positions, which have been shown to be critical
in vivo, influence the DNA binding preferences of MyoD.

To address the importance of the MyoD junction region for
DNA binding, we altered MyoD positions 14 and 15 (Fig. 8A)
and left position 13 intact because it is not required for the
MyoD sequence preference in the MD(E12B-AT) mutant
(Fig. 2 and 3C). Substitution of alanine for S14, which does not
interact with DNA (35), increased binding affinity [MD(AK)]
(Fig. 8A; Fig. 8B and C, lanes 4 and 5), perhaps by stabilizing
the BR helix. The preference of MD(AK) for the MyoD site
was not substantially altered by replacement of position 15 with
alanine [MD(AA)] or with either aspartic acid [MD(AD)] or
serine [MD(AS) and MD(QS)], which correspond to residues
from E12 or Twist, respectively (Fig. 8A; Fig. 8B and C, lanes 5
to 9). The relative preferences of these mutants for the MyoD
site are comparable to the binding preferences of other proteins
that were confirmed by binding competition analysis (Fig.
4 and 6). Apparently, appropriately specific DNA binding by
MyoD homodimers is not impaired by a variety of BR-HLH
junction substitutions, including nonconservative mutations of
K15. This flexibility contrasts with the importance of the junction
region for positioning the E12 BR and with the requirement
for K15 for myogenesis.

To investigate the role of BR positions 5 and 6 in a neutral
context, we first substituted alanine for two nonconserved BR
residues (MD-AAATA [Fig. 8A]) that are not predicted to be
required for DNA binding (22, 35). This substitution proportionally
increased binding to both sites in the context of MyoD
(MD-AAATA [Fig. 8B and C, lanes 10]) and enhanced specificity
for the MyoD site in the context of MD(AA) (Fig. 8A;
Fig. 8B and C, lanes 12). Replacement of T6 with asparagine
conferred a preference for the Twist site (MD-AAANA [Fig.
8A; Fig. 8B and C, lanes 10 and 13]), a finding that parallels the
preferences of MD(E12B-AT) and MD(E12B-A) (Fig. 3B and
C). This effect was not diminished by various BR-HLH junction
mutations or enhanced by presence of Twist junction
residues (Fig. 8B and C, lanes 13 to 17), indicating that N6 is
the most important of these residues for the Twist sequence
preference. To test whether E2A amino acids that correspond
to the three myogenic residues could specify an E2A-like DNA
binding preference, we introduced an asparagine at BR position
5 into MD-AAANA and MD-AAANA(AD), the latter of
which contains the D15 residue characteristic of E2A proteins
(Fig. 8A). In contrast to MD(E12BJ), these mutants strongly
preferred the Twist site to the MyoD or MCK-R sites (Fig. 8B
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FIG.  9.  DNA  binding  by  E12  mutants.  DNA  binding  by  the  indicated  protein 
complexes  is  assayed  as  for  Fig.  5C  except  that  all  E12  derivatives  are  present  at 
8  pM  and  E47  is  present  at  19  pM.  A  protein-DNA  complex  of  intermediate 
mobility that corresponds to E47-E12 heterodimers is indicated by an asterisk,
and  a  background  species  is  indicated  by  a  closed  triangle. 


lower level than E12(MDB) and did not have a markedly
increased preference for either the MyoD or MCK-R sites
(Fig. 9, lanes 4, 11, and 18). Heterodimerization with E47
increased the relative levels with which E12(MDBJ) bound to
the MyoD and MCK-R sites (Fig. 9, lanes 6, 7, 13, 14, 20, and
21) but also did not identify DNA binding effects that appear
to be sufficient to account for the different functional properties
of E12(MDB) and E12(MDBJ). These findings further
support the idea that the MyoD junction region is not critical
for DNA binding (Fig. 8B and C, lanes 4 to 9) and instead is
important for myogenesis because it is involved in other interactions
(18).


DISCUSSION 

bHLH protein DNA binding specificity deriving from effects
on BR-DNA conformation. The myogenic MyoD BR residues
A5 and T6 are essential for myogenesis but not for binding of
MyoD-E2A heterodimers to a muscle-specific site in vitro or in
vivo (18, 57). However, we have determined that these residues
are required for MyoD to bind DNA with its characteristic
specificity for particular CANNTG sites. Substitution of asparagine
for T6, and especially for both A5 and T6, results in
MyoD binding preferentially to a Twist site (Fig. 8B and C,
lanes 10, 13, and 18). The Twist-like MD(E12B) sequence
preference is affected partially by substitution of A5 for the
corresponding asparagine [MD(E12B-A) (Fig. 3C)] but is reconfigured
by introduction of both A5 and T6 so that it is
indistinguishable from that of wild-type MyoD [MD(E12B-AT)
(Fig. 3C)]. The data indicate that MyoD residues A5 and
T6 are each critical for its DNA binding sequence preferences
and that the N6 residue, which is common to the Twist and
MD(E12B-A) BRs (Fig. 2), is important for the Twist-like
preference. Mutations of these individual BR residues alter
sequence preferences across each half-site (Fig. 3C), raising
the question of how they might have such a global effect on
how the BR helices and the DNA interact preferentially with
each other.

A structure of MyoD obtained by X-ray crystallography suggests
how A5 and T6 might influence binding sequence specificity.
When bound to its preferred recognition site, MyoD
does not directly contact base pairs that it specifies in the
center of and flanking the CANNTG consensus (35). However,
A5 and T6 allow the MyoD BR helix to pack more tightly
into the major groove than do the corresponding N5 and N6
residues of E2A proteins, in part because of their smaller sizes
(Fig. 1 and 2) (35). As a result, the MyoD BR residues T6 and
R2 directly contact CANNTG bases at ±2 and ±3, respectively,
and R1 binds a backbone phosphate at ±6 (Fig. 1) (35).
In contrast, in E47 R2 swings out of the major groove and
contacts the backbone, and the residue at position 1 does not
interact directly with the DNA (12, 19). Supporting the idea
that A5 and T6 influence the conformation of the DNA-bound
BR, substitution of asparagine for A5 in MyoD increases its
sensitivity to protease digestion (29). Our findings suggest that
protein-DNA interactions that depend specifically on the
MyoD A5 and T6 residues may directly influence how the BR
helix interacts preferentially with the DNA and thereby indirectly
specify its characteristic sequence preferences at positions
within and flanking the CANNTG consensus.

Such indirect conformational effects also appear to be critical
for the E2A and Twist sequence preferences. When E47
homodimers bind DNA, a single subunit contacts a base in the
center of the site through R10 (Fig. 2). This interaction could
be important for the asymmetric E2A homodimer sequence
preference (19). However, the Twist-like sequence preference
that is characteristic of Twist-E2A heterodimers and
MD(E12B) homodimers is different across each 5-bp half-site
and symmetric (Fig. 3C and 5B), suggesting that it is likely to
be established indirectly, through an intermolecular effect that
involves a distinct positioning of the BR helix. Introduction of
the E12 BR-HLH junction region into MD(E12B) corrects its
binding preference so it is like that of E2A homodimers
[MD(E12BJ) (Fig. 5C, lanes 7, 16, and 25; Fig. 8B and C, lanes
20)], implicating the BR-HLH junction in this effect. Presumably,
the E2A junction acts in concert with the asparagines at
BR positions 5 and 6 (Fig. 2), although the Twist-like preference
of the MD-AANNA(AD) mutant (Fig. 8B and C, lane 19,
and data not shown) suggests that the E2A junction residue
D15 is not sufficient. The finding that E2A proteins can be
targeted to different DNA sequences by different dimer partners
may have important implications for their in vivo functions.

In contrast, the BR-HLH junction region does not have a
strong influence on the MyoD DNA binding preference. Various
MyoD junction mutations do not substantially diminish its
preference for a MyoD site (Fig. 8B and C, lanes 5 to 9). In
addition, the similar sequence preferences of E12(MDB) and
E12(MDBJ) homodimers (Fig. 9, lanes 3, 4, 10, 11, 17, and 18)
contrast sharply with the different specificities of MD(E12B)
and MD(E12BJ) (Fig. 3D, lanes 2 and 6; Fig. 8B and C, lanes
20). This apparent difference between MyoD and E2A proteins
might derive from the distinct arrangement of the BR
helix on the DNA that results from presence of MyoD residues
A5 and T6.

It is striking that as a group, these various bHLH mutants
and dimer combinations bind DNA with a limited number of
discrete sequence preferences (Figs. 3C and 5B). Presumably,
each of these preferences reflects a preferred conformational
state that is dictated by how each BR helix and the corresponding
DNA sequence conform to each other in an induced fit
(49). This mechanism for recognizing particular CANNTG
sites appears to be different from the direct recognition of
central bases that is characteristic of bHLH proteins that contain
R13 and bind to CACGTG or CATGTG sites (20, 21, 48).
Consistent with this idea, BR residues 5 and 6 do not appear to
be important for the function of the R13-containing bHLH
protein c-Myc (10). In E2A and its tissue-specific dimerization
partners, a more flexible conformation-based mechanism
might have evolved to increase adaptability in both sequence
recognition and function, so that different combinations of
these proteins can result in distinct protein-DNA conformations
that correspond to particular DNA sequence preferences.
Such a model may be particularly plausible for bHLH proteins,
because folding of the BR into an α helix is driven by its
interaction with the DNA (2).

BR-DNA conformation, DNA binding specificity, and myogenesis.
The observation that the MyoD junction and K15 are
not required for an appropriate DNA binding specificity (Fig.
8B and C, lanes 6 to 9; Fig. 9) supports the model that K15 is
involved in other essential interactions (18). However, our
experiments also pose the question of how the functional importance
of A5 and T6 might be related to their effects on DNA
recognition. Of the MyoD BR mutants that we have analyzed,
those that do not induce myogenesis bind to DNA as homodimers
with a Twist-like preference [MD(E12B) and
MD(E12B-A) (Fig. 2 and 3C)]. Heterodimers of MD(E12B)
with E12 prefer a heterodimer site (Fig. 5B), but with markedly
diminished specificity compared to MyoD-E12 dimers (Fig.
5C, lanes 3, 5, 12, 14, 21, and 23; Fig. 6; Fig. 7A and B, lanes
1 to 12). This finding suggests that at least in part, A5 and T6
may be significant for myogenesis because they restrict the
DNA binding specificity of MyoD and other myogenic bHLH
proteins, so that they are less likely to bind inappropriate sites.
However, other observations support a role for the A5 and T6
residues in protein-protein interactions. They have been implicated
in binding to other proteins off the DNA (26, 38), and
evidence indicates that they are required for activation domain
exposure (5, 29, 57) and cooperative DNA binding (3). Finally,
unlike MyoD, MD(E12B) can activate transcription of a reporter
only in particular cell lines, implicating the BR in
protein-protein interactions (57).

In light of evidence that A5 and T6 establish the conformation
of the DNA-bound BR, it is an attractive model that this
effect might influence the function of myogenic bHLH proteins
directly, by affecting their interactions with other proteins.
Given that relatively subtle alterations of the MyoD BR and
junction region can enhance MyoD DNA binding significantly
[MD(AK) and MD(AAATA) (Fig. 8B and C, lanes 4, 5, and
10)], it appears likely that cooperative protein-protein interactions
with the BR and junction could influence binding affinity.
It has been demonstrated recently that MyoD binds cooperatively
with other DNA binding proteins to a particular
muscle-specific promoter (4). The E box sequences through which
MyoD activates transcription in the context of this promoter
can differ from those that it binds preferentially in vitro (28),
suggesting that DNA sequence recognition may be influenced
by interactions with cooperating proteins in vivo. In addition,
interactions with cooperating proteins might be influenced in
turn by the specificity of DNA sequence recognition, as suggested
by evidence that for MyoD and E proteins, the choice
between homo- or heterodimer formation may be dictated by
the DNA binding affinities of the individual BRs (36, 59). Our
findings are consistent with the idea that deceptively subtle
aspects of sequence recognition could be important for the
biological activity of MyoD, if they influence functionally critical
interactions that might also involve K15 or other MyoD
regions.
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